WO 98/28431 PCT/GB97/03546
TRANSCRIPTIONAL REGULATION IN PLANTS 

Field of the invention
The present invention relates to meiosis in plants and
more particularly to the identification of regulatory
sequences for meiosis-specific transcription of genes of
interest and uses thereof.

Background of the invention

Meiosis provides a mechanism by which a heterozygous
individual can create large numbers of genotypically unique
recombinant gametes. Chromosomes replicate during
interphase, as in mitosis, and enter meiosis with two
chromatids. During meiotic prophase I, chromosomes condense
from the dispersed state typical of interphase to form
long thin threads in leptotene, and each acquires a
proteinaceous axial core to which the two sister chromatids
are attached. During zygotene, homologous chromosomes
become aligned, forming the synaptonemal complex and, at
pachytene, non-sister chromatids of the completely paired
chromosomes recombine, forming the chiasmata which become
visible during diplotene. Two cell divisions follow,
reductional and equational, resulting in four gametes,
each carrying every chromosome as a single, potentially
recombinant chromatid.

In yeast, molecular genetic analysis has revealed and
led to the isolation of several genes that are essential
for meiosis (Mitchell, 1995). For example, the DMC1 (Bishop
et al., 1992) and RAD51 (Shinohara et al., 1992) genes are
homologues of the E. coli recA gene, and appear to play a
role not only in recombination-mediated homology-dependent
pairing, but also in the strand exchange that results in
chiasmata. Other genes, such as ZIP1, are required for
synaptonemal complex formation (Sym et al., 1993).

In higher eukaryotes, molecular analysis of the
mechanisms controlling chromosomal pairing has been
significantly more difficult than in yeast, since the
systems are more complex and less easy to manipulate.
However, several meiotic mutations have been identified in
Drosophila (for review see Carpenter, 1993; 1994) and
higher plants (Sears, 1976; Golubovskaya, 1989; Curtis and
Doile, 1992; Golubovskaya et al., 1992, 1993; Tascheto and
Pagliarini, 1993) and studied for their effect on meiosis.
One such plant gene, Ph1, suppresses pairing of
nonhomologous chromosomes in wheat, and a Ph1-mutant
background was used for the transfer and introgression of
alien chromosome segments into wheat (Riley et al., 1968).

Plants have also provided an excellent cytological
system for the study of meiosis (Gillies, 1984; Albini and
Jones, 1987, 1988; Sherman and Stack, 1992; Albini, 1994;
Schwarzacher and Heslop-Harrison, 1995), but there has been
little investigation at the molecular level.

Lily anthers have offered the best system to study
biochemical events that are correlated with the different
stages of meiosis, because these stages are protracted and
synchronised in adjacent flower buds, permitting the
isolation of temporally regulated cDNA clones (Kobayashi et
al., 1994). One such gene, LIM15, is expressed
specifically in prophase I of meiotic cells and is
highly homologous to the yeast DMC1 gene (Kobayashi et
al., 1993).

In both yeast and lily, DMC1 and RAD51 proteins
colocalize during zygotene (Bishop, 1994; Terasawa et al.,
1995). Early meiosis cDNA clones were also identified in
wheat and maize by hybridisation to a Lilium
meiosis-specific cDNA clone (Ji and Langridge, 1994).

The first genomic sequence of a recA-like plant gene,
ArLIM15, with a high degree of homology to that of LIM15, was
recently described in Arabidopsis thaliana (Sato et al.,
1995). However, no data showing the expression pattern of
ArLIM15 and no characterization of a meiosis-specific
promoter have so far been reported by others. The sequence of
the upstream region of the ArLIM15 gene, shown in Figure 4 of
Klimyuk and Jones 1996 (see below), contains predominantly
the sequence of a transposon-like element, Limpet1 (1874 bp),
and only 260 bp of the promoter region, which is not
sufficient to confer meiosis-specific expression of a
reporter gene.

For many applications it will be useful to drive
transcription of different genes in specific parts of the
plant at specific developmental stages. The promoters of
plant meiotic genes are of extreme interest as they may
provide transcriptional regulation of their genes during
this very restricted developmental period. The isolation
and characterization of these promoters enables the study
and modification of fundamental processes taking place
during early sporogenesis, as well as study of the impact
of such modifications on more advanced stages of sexual
reproduction in plants. Work by the present inventors
outlining their objectives has been shown as a poster
display (Klimyuk, V.I. et al. "The isolation and
characterisation of the meiosis-specific Arabidopsis
thaliana DMC1 gene". Abstracts of the 6th International
Conference on Arabidopsis Research. June 7-11, 1995,
Madison, Wisconsin, USA and Klimyuk, V.I. and Jones, J.D.G.
"Identification of a transposon-like element, Limpet1, in
Arabidopsis thaliana". Abstracts of the 7th International
Conference on Arabidopsis Research. June 23-27, 1996,
Norwich, UK) and their work has been presented in a paper
published after the priority date of the present
application (Klimyuk, V.I. and Jones, J.D.G. "AtDMC1, the
Arabidopsis homologue of the yeast DMC1 gene:
characterization, transposon-induced allelic variation and
meiosis-associated expression". Plant J (1997), 11(1), 1-14)
which is incorporated herein by reference.

Summary of the invention
The primary aim of the inventors was to identify and
isolate a plant promoter which confers meiosis-specific
transcriptional regulation in plants.

They used the homologies between the lily LIM15 and
the yeast DMC1 genes to design degenerate PCR (polymerase
chain reaction) primers that amplified the Arabidopsis
meiosis-specific DMC1 gene (designated AtDMC1). The AtDMC1
gene was completely sequenced and the transcript was
characterised by RT (reverse transcriptase)-PCR. In situ
hybridisation analysis showed that AtDMC1 expression is
restricted to pollen mother cells in anthers and megaspore
mother cells in ovules. A translational fusion was made
between the AtDMC1 promoter and the GUS reporter gene.
Transgenic Arabidopsis carrying the AtDMC1 promoter:GUS
reporter gene fusion initiated GUS expression at the time
of meiosis in both male and female lineages.
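
Degenerate primers of the kind referred to above are conventionally written with IUPAC ambiguity codes, each code standing for a set of bases. The following is a minimal illustrative sketch in Python, not the inventors' procedure, of how such a primer expands into the concrete oligonucleotides it represents. The example primer string is hypothetical and is not taken from the LIM15, DMC1 or AtDMC1 sequences.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    pools = [IUPAC[code] for code in primer.upper()]
    return ["".join(bases) for bases in product(*pools)]


# Hypothetical primer for illustration only (N = any base, R = A or G).
example_primer = "GGNAARAC"
sequences = expand(example_primer)
```

The degeneracy grows multiplicatively with each ambiguous position, which is why primers are usually placed over the most conserved stretches of the aligned proteins.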

Sequence comparison of AtDMC1 and ArLIM15 (Sato et
al., 1995) revealed that these genes, isolated from
Landsberg erecta and Columbia ecotypes respectively, encode
the same protein but are different in their promoter
regions. We determined that the difference was caused by
the insertion of a 1874 bp transposon-like element,
designated Limpet1, into the promoter of ArLIM15. This
finding revealed that the previously published putative
promoter region of ArLIM15 (Sato et al., 1995)
predominantly contains Limpet1 sequences and would not be
expected to be sufficient to confer meiosis-specific
expression of the GUS reporter gene. The sequence of the
functional AtDMC1 promoter as well as the alignment of the
promoter regions of the AtDMC1 and ArLIM15 genes upstream
of their transcription start sites are shown in Figure 4A
and B, respectively.

Reference herein to the AtDMC1 promoter being used in
various ways, in accordance with the present invention,
should be taken to be reference to a promoter including all
or part of the promoter sequence shown in Figure 4(A), or
a variant or derivative thereof, but excluding the promoter
of ArLIM15 (Sato et al., 1995). A part (fragment), variant
or derivative of the promoter sequence shown should be
sufficient to confer meiosis-specific expression on a
heterologous sequence operatively linked to, i.e. under the
control of, the part, variant or derivative of the sequence
shown. One or more fragments of the sequence may be
included in a promoter according to the present invention,
for instance one or more motifs may be coupled to a
"minimal" promoter. Such motifs may confer
meiosis-specific promoter function on a promoter which
otherwise drives expression in a non-meiosis-specific fashion.

According to a first aspect, the present invention 
provides a nucleic acid isolate comprising a promoter as 
indicated. 

In a second aspect, the present invention provides a
nucleic acid isolate comprising a promoter, the promoter
comprising a sequence of nucleotides shown in Figure 4(A)
and conferring meiosis-specific expression on a sequence
operably linked to the promoter. Restriction enzymes or
nucleases may be used to digest the full-length nucleic
acid shown, followed by an appropriate assay to determine
the minimal sequence required for developmental
specificity. A preferred embodiment of the present
invention provides a nucleic acid isolate with the minimal
nucleotide sequence shown in Figure 4(A) required for such
specificity.
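
The digestion-and-assay approach described above can be pictured as generating progressively shorter fragments and testing each for retained specificity. The Python sketch below is illustrative only and makes loud assumptions: deletions are taken from the 5' end in fixed steps, the sequence is a toy string rather than the Figure 4(A) sequence, and `toy_assay` is a hypothetical stand-in for a real expression assay such as a reporter gene experiment.

```python
def five_prime_deletions(promoter, step):
    """Yield (bases_removed, fragment) pairs for a 5' deletion series."""
    for removed in range(0, len(promoter), step):
        yield removed, promoter[removed:]


def minimal_fragment(promoter, step, assay):
    """Return the shortest fragment in the series that still passes `assay`.

    `assay` is any callable returning True when a fragment retains the
    activity of interest; returns None if no fragment passes.
    """
    shortest = None
    for _, fragment in five_prime_deletions(promoter, step):
        if assay(fragment):
            shortest = fragment  # fragments get shorter on each iteration
    return shortest


# Toy data for illustration only; not a real promoter or a real assay.
toy_promoter = "AAAACCCCGGGGTTTT"


def toy_assay(fragment):
    # Hypothetical rule: activity requires the made-up motif "GGGG".
    return "GGGG" in fragment


result = minimal_fragment(toy_promoter, 4, toy_assay)
```

In practice each fragment would be fused to a reporter and scored in planta, so the assay step is the experimentally expensive part and the deletion step size is chosen accordingly.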

The promoter may comprise one or more sequence motifs 
or elements conferring developmental and/or tissue-specific 
regulatory control of expression. Other regulatory 
sequences may be included, for instance as identified by a 
mutation or digestion assay in an appropriate expression 
system or by sequence comparison with available 
information, e.g. using a computer to search on-line 
databases. 

By "promoter" is meant a sequence of nucleotides from 
which transcription of DNA operably linked downstream (i.e. 
in the 3' direction on the sense strand of double-stranded
DNA) may be initiated. 

"Operably linked" means joined as part of the same 
nucleic acid molecule, suitably positioned and oriented for
transcription to be initiated from the promoter. DNA
operably linked to a promoter is "under transcriptional 
initiation regulation" of the promoter. 

The present invention extends to a promoter which has
a nucleotide sequence which is an allele, mutant, variant or
derivative, by way of nucleotide addition, insertion,
substitution or deletion, of a promoter sequence as provided
herein. Systematic or random mutagenesis of nucleic acid
to make an alteration to the nucleotide sequence may be
performed using any technique known to those skilled in the
art. One or more alterations to a promoter sequence
according to the present invention may increase or decrease
promoter activity, or increase or decrease the magnitude of
the effect of a substance able to modulate the promoter
activity.

"Promoter activity" is used to refer to the ability to
initiate transcription. The level of promoter activity is
quantifiable, for instance by assessment of the amount of
mRNA produced by transcription from the promoter or by
assessment of the amount of protein product produced by
translation of mRNA produced by transcription from the
promoter. The amount of a specific mRNA present in an
expression system may be determined, for example, using
specific oligonucleotides which are able to hybridise with
the mRNA and which are labelled or may be used in a
specific amplification reaction such as the polymerase
chain reaction. Use of a reporter gene facilitates
determination of promoter activity by reference to protein
production.

Therefore there is provided a nucleic acid molecule
comprising

(a) the Arabidopsis meiosis-specific DMC1 (AtDMC1)
gene promoter; or

(b) a meiosis-specific promoter homologous to (a)
but from another plant species; or

(c) a meiosis-specific promoter of the gene of a
homologous DMC1 protein from another plant species; or

(d) a meiosis-specific promoter variant, mutant,
allele or derivative of (a), (b) or (c); or

(e) a portion of (a), (b), (c) or (d) sufficient
to confer meiosis-specific character to a promoter
containing it.

In various embodiments of the present invention a
promoter which has a sequence that is a fragment, mutant,
allele, derivative or variant, by way of addition,
insertion, deletion or substitution of one or more
nucleotides, of the sequence of the promoter shown in
Figure 4(A), has homology with the shown sequence which is
at least about 5% greater than the homology that the
ArLIM15 promoter sequence (Sato et al.) has with the
sequence shown herein, preferably at least about 10%
greater homology, more preferably at least about 20%
greater homology, more preferably at least about 25% greater
homology. The sequence in accordance with an embodiment of
the invention may hybridise with the shown sequence but not
the ArLIM15 promoter sequence under appropriately stringent
selective hybridisation conditions. A promoter according
to the invention may include one or more motifs that appear
in Figure 4(A) and are able to confer meiosis-specificity
on a promoter which contains them, and which are not
present in the ArLIM15 promoter of Sato et al.

The present invention also includes meiosis-specific
promoters that are homologous to the AtDMC1 gene promoter.
Further, the present invention includes meiosis-specific
promoters of the gene of a homologous DMC1 protein from
another plant species. An homologous promoter or nucleic
acid encoding an homologous DMC1 protein may show greater
than 55% homology with the sequence of Fig. 4A or Fig. 5A
or Fig. 5B, greater than 65% homology, greater than 75%
homology, greater than 85% homology or greater than 95%
homology. Such homology may be shown for a sequence of at
least 20 nucleotide bases, at least 50 nucleotide bases, at
least 100 nucleotide bases, at least 300 nucleotide bases
or at least 500 nucleotide bases.
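
The homology thresholds above are stated as a percentage over a stretch of a given minimum length. As a minimal sketch only, and assuming that "homology" is read as percent identity between two sequences that have already been aligned without gaps (the specification does not prescribe a particular algorithm), such a figure could be computed as follows. The sequences used are toy strings, not those of Fig. 4A, 5A or 5B.

```python
def percent_identity(a, b):
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(a, b, window):
    """Highest percent identity over any stretch of `window` aligned bases."""
    if len(a) != len(b) or not 1 <= window <= len(a):
        raise ValueError("window must fit within equal-length sequences")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


# Toy aligned sequences for illustration only.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAA"
overall = percent_identity(seq_a, seq_b)
best_4 = best_window_identity(seq_a, seq_b, 4)
```

A real comparison would normally use a gapped alignment program, so the figures from this ungapped sketch should be regarded as a simplification.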
Further provided by the present invention is a nucleic
acid construct comprising a promoter region or a fragment,
mutant, allele, derivative or variant thereof as discussed,
able to promote transcription in a plant in a
meiosis-specific manner, operably linked to a heterologous
nucleic acid sequence, preferably a gene, e.g. a coding sequence.
By "heterologous" is meant a gene other than the AtDMC1
coding sequence. Modified forms of the AtDMC1 coding sequence
may be excluded. Generally, the gene may be transcribed
into mRNA which may be translated into a peptide or
polypeptide product which may be detected and preferably
quantitated following expression. A gene whose encoded
product may be assayed following expression is termed a
"reporter gene", i.e. a gene which "reports" on promoter
activity.

The present invention also provides a nucleic acid 
vector comprising a promoter as disclosed herein. Such a 
vector may comprise a suitably positioned restriction site 
or other means for insertion into the vector of a sequence 
heterologous to the promoter to be operably linked thereto. 

Suitable vectors can be chosen or constructed, 
containing appropriate regulatory sequences, including 
promoter sequences, terminator fragments, polyadenylation 
sequences, enhancer sequences, marker genes and other 
sequences as appropriate. For further details see, for
example, Molecular Cloning: a Laboratory Manual: 2nd
edition, Sambrook et al., 1989, Cold Spring Harbor
Laboratory Press. Procedures for introducing DNA into 
cells depend on the host used, but are well known. 

Thus, a further aspect of the present invention 
provides a host cell (which may be microbial or plant) 
containing a nucleic acid construct comprising a promoter 
element, as disclosed herein, operably linked to a 
heterologous nucleic acid sequence or gene. A still 
further aspect provides a method comprising introducing 
such a construct into a host cell. The introduction may 
employ any available technique well known to the person
skilled in the art.

The introduction may be followed by causing or
allowing expression of the heterologous nucleic acid
sequence or gene under the control of the promoter.

In one embodiment, the construct comprising promoter
and nucleic acid sequence or gene is integrated into the
genome (e.g. chromosome) of the host cell. Integration may
be promoted by including in the construct sequences which
promote recombination with the genome, in accordance with
standard techniques.

Many known techniques and protocols for manipulation
of nucleic acid, for example in preparation of nucleic acid
constructs, mutagenesis, sequencing, introduction of DNA
into cells and gene expression, and analysis of proteins,
are described in detail in Short Protocols in Molecular
Biology, Second Edition, Ausubel et al. eds., John Wiley &
Sons, 1992, the disclosure of which is incorporated herein
by reference.

Nucleic acid molecules, constructs and vectors
according to the present invention may be provided isolated
and/or purified (i.e. from their natural environment), in
substantially pure or homogeneous form, free or
substantially free of a coding sequence, or free or
substantially free of nucleic acid or genes of the species
of interest or origin other than the promoter sequence.
Nucleic acid according to the present invention may be
wholly or partially synthetic. The term "isolate"
encompasses all these possibilities.

An aspect of the present invention is the use of
nucleic acid according to the invention in the production
of a transgenic plant.

When introducing a chosen gene construct into a cell,
certain considerations, well known to those skilled in the
art, must be taken into account. The nucleic acid to be
inserted should be assembled within a construct which
contains effective regulatory elements which will drive
transcription. There must be available a method of
transporting the construct into the cell. Once the
construct is within the cell membrane, integration into the
endogenous chromosomal material either will or will not
occur. Finally, as far as plants are concerned, the target
cell type must be such that cells can be regenerated into
whole plants.

Plants transformed with the DNA segment containing the
sequence may be produced by standard techniques which are
already known for the genetic manipulation of plants. DNA
can be transformed into plant cells using any suitable
technology, such as a disarmed Ti-plasmid vector carried by
Agrobacterium exploiting its natural gene transfer ability
(EP-A-270355, EP-A-0116718, NAR 12(22) 8711 - 87215 1984),
particle or microprojectile bombardment (US 5100792,
EP-A-444882, EP-A-434616), microinjection (WO 92/09696, WO
94/00583, EP 331083, EP 175966, Green et al. (1987) Plant
Tissue and Cell Culture, Academic Press), electroporation
(EP 290395, WO 8706614), other forms of direct DNA uptake
(DE 4005152, WO 9012096, US 4684611), liposome mediated DNA
uptake (e.g. Freeman et al. Plant Cell Physiol. 29: 1353
(1984)), or the vortexing method (e.g. Kindle, PNAS U.S.A.
87: 1228 (1990d)). Physical methods for the transformation of
plant cells are reviewed in Oard, 1991, Biotech. Adv. 9:
1-11.

Agrobacterium transformation is widely used by those
skilled in the art to transform dicotyledonous species.
Recently, there has been substantial progress towards the
routine production of stable, fertile transgenic plants in
almost all economically relevant monocot plants (Toriyama,
et al. (1988) Bio/Technology 6, 1072-1074; Zhang, et al.
(1988) Plant Cell Rep. 7, 379-384; Zhang, et al. (1988)
Theor Appl Genet 16, 835-840; Shimamoto, et al. (1989)
Nature 338, 274-276; Datta, et al. (1990) Bio/Technology 8,
736-740; Christou, et al. (1991) Bio/Technology 9, 957-962;
Peng, et al. (1991) International Rice Research Institute,
Manila, Philippines 563-574; Cao, et al. (1992) Plant Cell
Rep. 11, 585-591; Li, et al. (1993) Plant Cell Rep. 12,
250-255; Rathore, et al. (1993) Plant Molecular Biology 21,
871-884; Fromm, et al. (1990) Bio/Technology 8, 833-839;
Gordon-Kamm, et al. (1990) Plant Cell 2, 603-618;
D'Halluin, et al. (1992) Plant Cell 4, 1495-1505; Walters,
et al. (1992) Plant Molecular Biology 18, 189-200; Koziel,
et al. (1993) Biotechnology 11, 194-200; Vasil, I. K.
(1994) Plant Molecular Biology 25, 925-937; Weeks, et al.
(1993) Plant Physiology 102, 1077-1084; Somers, et al.
(1992) Bio/Technology 10, 1589-1594; WO92/14828). In
particular, Agrobacterium mediated transformation is now
emerging also as a highly efficient alternative
transformation method in monocots (Hiei et al. (1994) The
Plant Journal 6, 271-282).

The generation of fertile transgenic plants has been
achieved in the cereals rice, maize, wheat, oat, and barley
(reviewed in Shimamoto, K. (1994) Current Opinion in
Biotechnology 5, 158-162; Vasil, et al. (1992)
Bio/Technology 10, 667-674; Vain et al., 1995,
Biotechnology Advances 13 (4): 653-671; Vasil, 1996, Nature
Biotechnology 14 page 702).

Microprojectile bombardment, electroporation and 
direct DNA uptake are preferred where Agrobacterium is 
inefficient or ineffective. Alternatively, a combination 
of different techniques may be employed to enhance the 
efficiency of the transformation process, eg bombardment 
with Agrobacterium coated microparticles (EP-A-486234) or 
microprojectile bombardment to induce wounding followed by 
co-cultivation with Agrobacterium (EP-A-486233).

Following transformation, a plant may be regenerated,
e.g. from single cells, callus tissue or leaf discs, as is
standard in the art. Almost any plant can be entirely
regenerated from cells, tissues and organs of the plant.
Available techniques are reviewed in Vasil et al., Cell
Culture and Somatic Cell Genetics of Plants, Vol I, II and
III, Laboratory Procedures and Their Applications, Academic
Press, 1984, and Weissbach and Weissbach, Methods for Plant
Molecular Biology, Academic Press, 1989.

The particular choice of a transformation technology
will be determined by its efficiency to transform certain
plant species as well as the experience and preference of
the person practising the invention with a particular
methodology of choice. It will be apparent to the skilled
person that the particular choice of a transformation
system to introduce nucleic acid into plant cells is not
essential to or a limitation of the invention, nor is the
choice of technique for plant regeneration.

Also according to the invention there is provided a
plant cell having incorporated into its genome nucleic
acid, particularly heterologous nucleic acid, as provided
by the present invention. A further aspect of the present
invention provides a method of making such a plant cell
involving introduction of a vector including the sequence
of nucleotides into a plant cell and causing or allowing
recombination between the vector and the plant cell genome
to introduce the sequence of nucleotides into the genome.
The invention extends to plant cells containing nucleic
acid according to the invention as a result of introduction
of the nucleic acid into an ancestor cell.

Plants which include a plant cell according to the
invention are also provided, along with any part or
propagule thereof, seed, selfed or hybrid progeny and
descendants. A plant according to the present invention
may be one which does not breed true in one or more
properties. Plant varieties may be excluded, particularly
registrable plant varieties according to Plant Breeders'
Rights. It is noted that a plant need not be considered a
"plant variety" simply because it contains stably within
its genome a transgene, introduced into a cell of the plant
or an ancestor thereof.

In addition to a plant, the present invention provides
any clone of such a plant, seed, selfed or hybrid progeny
and descendants, and any part of any of these, such as
cuttings or seeds. The invention provides any plant
propagule, that is any part which may be used in
reproduction or propagation, sexual or asexual, including
cuttings, seed and so on. Also encompassed by the
invention is a plant which is a sexually or asexually
propagated off-spring, clone or descendant of such a plant,
or any part or propagule of said plant, off-spring, clone
or descendant.

The AtDMC1 promoter may be used in meiosis-specific
expression of a heterologous sequence, including a variety
of cytotoxic genes, eg ribosome inactivating proteins (Lord
et al., 1991) or cytotoxic RNase barnase (Goldman et al.,
1994), DNA modifying enzymes including rare-cutting
site-specific endonucleases, eg XbaI, I-SceI, HO endonuclease
(Brenneman et al., 1996; Puchta et al., 1996; Chiurazzi et
al., 1996; Haber, 1995), or recombinases, eg FLP
recombinase (Kilby et al., 1995), recombinase from
Zygosaccharomyces rouxii (Onouchi et al., 1995),
bacteriophage P1 cre recombinase (Osborne et al., 1995), as
well as different transcription factors (Meshi & Iwabuchi,
1995; Ramachandran et al., 1994), protein kinases and
phosphatases (Stone & Walker, 1995), cell cycle regulators
(Ferreira et al., 1994; Dahl et al., 1995), which are
normally not expressed during the time of meiosis. This
may serve different purposes, such as ablation of meiotic
cells and isolation of apomictic plants, designing an
efficient homologous recombination system for plants,
increasing meiotic recombination frequency for
introgression of alien chromosome segments into a host plant,
or altering normal events of the cell cycle during the time of
meiosis and producing male and female sterile plants.
Easily assayed reporter genes, eg GUSA (Jefferson, 1987) or
GFP (Sheen et al., 1995), under control of the AtDMC1 promoter
may be used as markers for detection of early meiotic
events in plants, especially for the analysis of different
mutations affecting meiosis.

The AtDMC1 promoter may be used for improving
transposon tagging efficiency. The main factors determining
the efficiency of a transposon-based system in the tagging of
host genes are the frequencies of transposon excision and
reinsertion, and the independence of transposition events.
The last factor is crucial for a system with a high level of
excision and reinsertion events, as such systems often yield
clonal transpositions. Using the AtDMC1 promoter to drive
transposition at restricted stages of plant
development, particularly at the early stages of meiosis,
improves the efficiency of transposon tagging by producing
only unique (independent) transposition events.

Therefore, the present invention provides a method of
transposon tagging comprising the steps of creating a
construct comprising a promoter as described above operably
linked to a transposase required for transposition of a
transposable element; transforming a plant cell with said
construct such that the transposase is driven by said
promoter; and determining transposition events.

The AtDMC1 promoter may be used in searching for
apomictic mutants or used by seed producers to produce
seeds apomictically. Apomixis refers to the asexual
reproductive processes that occur in ovules of flowering
plants (Koltunow, 1993). Mutants with
fertilisation-independent seed development have been recently
described for Arabidopsis (Ohad et al., 1996; Chaudhury et al., 1997).
Apomictically produced plants are genetically identical
with the female parental plants. Apomictic reproduction may
therefore be beneficial for agriculture, as it may be an
inexpensive way to preserve a given genotype through
successive generations.

Ablation of meiotic cells or changing their fate
through altering events of the meiotic cell cycle can be
extremely beneficial for apomictic seed production, as it
will eliminate sexually produced seeds from the progeny.
Thus, the only seeds that will survive will be apomictic.

A schematic presentation of an exemplary construct
suitable for searching for apomictic mutants as well as for
apomictic seed production is shown in Figure 1A. The
construct carries a meiosis-specific promoter (msp) fused
through a DNA insertion sequence (DNA ins) to a
modification nucleic acid sequence, preferably a gene (tx)
encoding a cytotoxic protein. The DNA insertion sequence
may be any DNA sequence (another gene, eg a transformation
marker or counter selectable marker) which does not confer
transcription of the cytotoxic gene and is sufficiently long to
prevent transcription of the cytotoxic gene from the
meiosis-specific promoter. The DNA insertion is preferably
flanked with DNA sequences which are the target sequences (ts)
for an activator, preferably a DNA recombinase (r-se), under
control of an inducible promoter (ip). The target sequences
may be, for example, lox recombination sites with the same
orientation in the case of the cre recombinase, or the 5'
and 3' Ac ends for the Ac transposase.

The transcription of the DNA recombinase gene may be
controlled by an inducible promoter (ip), for example, the
promoter of a heat shock inducible gene (Yoshida et al.,
1995). Optionally, any other inducible system may be used,
for example, a copper-controllable (Mett et al., 1993) or
glucocorticoid-controllable (Aoyama et al., 1995) gene
expression system, but in this case the construct design
will be more complex, as such systems are transcription
factor mediated.

Induction of the recombinase or transposase gene
expression will lead to the elimination of the DNA
insertion due to the homologous recombination between two
flanking target sequences in the case of a cre/loxP-like
system (one target sequence will be left behind) or
complete excision of the DNA insertion together with both
target sequences in the case of an Ac/Ds-like system. This will
lead to the meiosis-specific transcription of the cytotoxic
gene and, consequently, meiotic cell death. For some
powerful cytotoxins (barnase), a low expression level of an
inhibitor protein (txi) (barstar) would be an advantage, as
it may prevent the negative effect of possible leakage in
cytotoxic gene expression.
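
The two excision outcomes described above can be illustrated with a toy model. The Python sketch below is an illustration only: it represents the Figure 1A construct as an ordered list of element labels (msp, ts, DNA ins, tx, as abbreviated in the text), and the function names `excise` and `tx_transcribed` are hypothetical. It models only the order of elements, not sequence or biology.

```python
# Elements of the exemplary construct, in 5' to 3' order.
CONSTRUCT = ["msp", "ts", "DNA ins", "ts", "tx"]


def excise(construct, system):
    """Return the construct after induced excision of the DNA insertion.

    "cre/lox": recombination between the two target sequences removes the
               insertion and leaves one target sequence behind.
    "Ac/Ds":   the insertion is excised together with both target sequences.
    """
    first = construct.index("ts")
    last = len(construct) - 1 - construct[::-1].index("ts")
    if system == "cre/lox":
        return construct[:first + 1] + construct[last + 1:]
    if system == "Ac/Ds":
        return construct[:first] + construct[last + 1:]
    raise ValueError("unknown system: " + system)


def tx_transcribed(construct):
    """True if no DNA insertion separates the promoter (msp) from the gene (tx)."""
    between = construct[construct.index("msp") + 1:construct.index("tx")]
    return "DNA ins" not in between


after_cre = excise(CONSTRUCT, "cre/lox")
after_ac = excise(CONSTRUCT, "Ac/Ds")
```

In both cases the cytotoxic gene comes under control of the meiosis-specific promoter; the difference is whether a single target sequence remains between them.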

This system may be useful, firstly, for apomictic seed
production, as the induction of the recombinase gene will lead
to the ablation of meiotic cells and viable seeds may be
produced as the result of apomictic reproduction; and
secondly, for the detection of apomictic mutants,
especially in genotypes which are predisposed to apomixis
when reproduction is not successful, e.g. citrus and grass
species (Koltunow, 1993). In this case, seeds or plants or
plant parts, eg pollen, carrying a construct as described in
Figure 1A may be mutagenised and the site-specific
recombinase gene expression can be induced in self-progeny
derived from mutagenised seeds. In this situation viable
seeds will only be produced apomictically in an appropriate
mutant background.

For increasing the efficiency of the system it may be
an advantage to use an easily screenable counter selectable
marker, for example a bacterial cytochrome P450 (O'Keefe et
al., 1994) or phosphonate monoester hydrolase (pehA)
(Dotson et al., 1996), as the DNA insertion (DNA ins) (see
Fig. 1A). These genes catalyse conversion of pro-herbicides
into herbicides, and any seedlings (potentially false
positives) with an inactive cytotoxic gene (tx) (because of
the presence of the DNA insertion) will be easily removed by
treatment with the appropriate pro-herbicide.

In this case the use of a cre/lox-like system (i.e. one
involving site-specific recombination) is preferable to an
Ac/Ds-like system, as the activation of the former will
lead to the loss, not reintegration, of the DNA insertion.

Therefore, the present invention provides a method for
producing seeds apomictically comprising the steps of
modifying plant cells by incorporating into their genome a
nucleic acid construct as described above so that the
modification nucleic acid sequence is expressed in said
meiotic cells, thereby eliminating sexually produced seeds
in a plant regenerated therefrom. The nucleic acid
construct may be incorporated into the cell genome by
breeding techniques well known to the person skilled in the
art or by standard transformation techniques.

Also in accordance with the present invention there is
provided a method of detecting apomictic mutants comprising
the steps of generating a plant or plant part carrying a
nucleic acid construct as described above; creating a
variation for apomixis either in accordance with a breeding
programme or in accordance with a mutagenesis programme;
deriving a suitable self progeny population; inducing the
activating gene in the self progeny as described above; and
detecting viable seeds produced apomictically. A suitable
self progeny population will be understood by those skilled
in the art to mean one where the phenotype of the mutation
can be revealed, e.g. where individuals are at least
partially homozygous by selfing. Genetic variation could be
created from breeding and selection of existing genetic
stocks or by inducing such genetic variation, for example
by mutagenesis.

In Figure 1B we present an exemplary construct
permitting the removal of any unwanted DNA sequences from
transgenic plants. It may be useful, for example, for
eliminating genes conferring antibiotic resistance, which
are required for selection of transformed plants, but are
not necessarily needed afterwards. Moreover, the presence of
such genes can be damaging for the release of transgenic
crops (or at least controversial, given the current
political climate and food scares). The use of the Cre/lox
recombination system to remove selected genes from
transgenic plants has already been reported (Dale & Ow, 1991).
The system described by Dale and Ow (1991) required a
second round of transformation in order to introduce a
recombinase source under the control of a constitutive
promoter. In our construct a meiosis-specific promoter (msp)
drives the transcription of a site-specific recombinase gene
(r-se), which, e.g. as described above, eliminates the
DNA of interest (DNA ins) through the mechanism of
homologous recombination during the first cycle of sexual
reproduction in the transgenic plant. Unlike the Cre/lox system
of Dale and Ow (1991), it will not require any additional
rounds of transformation or the use of inducible promoters to
drive the recombinase gene. The recombination event can be easily
monitored in progeny of the primary transformant by placing the
DNA insertion between a constitutive promoter (cp) and a
reporter gene (rg) or by using PCR screening.

Therefore, an aspect of the present invention is to
prevent seed derived from sexual reproduction so that only
apomictic seed are set (in plants where apomixis can
occur). This would be of interest for species where making
hybrid seed is uneconomic on a large scale. The hybrid
seed may be made on a small scale and then be multiplied
apomictically to produce large-scale quantities of hybrid
seed. Hybrid seed is desirable because hybrids generally
perform better than inbreds.

Another aspect of the present invention is to identify
plants with a tendency to form apomictic seed. This can be
achieved in connection with a mutagenesis or breeding
programme (see Fig. 1A). Further, the present invention
also relates to a method for removing pieces of nucleic
acid from transgenic plants (e.g. sequences just needed to
transform/select transgenic cells/plants).

Aspects and embodiments of the present invention will 
now be illustrated, by way of example only, with reference 
to the accompanying figures. Further aspects and 
embodiments will be apparent to those skilled in the art. 
All documents mentioned herein are incorporated by
reference.

Brief description of the drawings 
In the Figures: 

Figure 1 shows a schematic representation of uses of a
meiosis-specific promoter in searching for apomictic
mutants, maintaining the apomictic mechanism of seed
production and removing unwanted DNA sequences from
transgenic plants. Abbreviations: tm - transformation
marker; wp - weak promoter; tx - gene conferring toxic
phenotype; txi - inhibitor of tx; msp - meiosis-specific
promoter; ts - target sequence for r-se gene product; DNA
ins - any DNA sequence sufficiently large to block the
meiosis-specific transcription of tx gene; cp -
constitutive promoter; rg - reporter gene; ip - inducible
promoter; r-se - gene conferring DNA recombination.
Encircled parts of the constructs are optional.

Figure 2 shows the restriction map (A) of the AtDMC1
gene and T-DNA region of the binary vector carrying the pAtDMC1:GUS
fusion. The positions of degenerate primers used to produce
AtDMC1-specific probe A are shown by arrows. Probe B, which
was used for RFLP mapping, corresponds to the 5.9 kb EcoRV
fragment of the genomic clone. The positions and sizes of 15
exons are shown by open boxes; the ATG translation initiation
and TGA translation termination codons
are also shown. The promoter region used for the pAtDMC1:GUS
translational fusion is shown as a solid line. Figure 2(B)
shows the schematic representation of the plasmid SLJ7753
T-DNA region introduced into A. thaliana in order to assess
the specificity of the AtDMC1 promoter. The construct
carries a pAtDMC1:GUS translational fusion and the NPT gene
as a transformation marker.

Figure 3 shows the nucleotide sequence of the AtDMC1
promoter and adjacent region. The nucleotide sequence from
-450 to +330 is shown. The transcription start site is
designated +1. The putative TATA box sequence is boxed. The
alternative putative TATA box is overlined. Two nearly
complete and two complete direct repeats are underlined.
Consensus sequences of splicing are indicated by double
underlining. The ATG codon in the second exon (shown in
bold) was mutagenised in order to introduce an NcoI site
for translational fusion with the GUS gene.

Figure 4 shows (A) DNA sequence of the AtDMC1 promoter
including part of the transcribed region. The transcription
start site is located at position 4691. (B) Alignment of
the promoter regions of the AtDMC1 (top strand) and ArLIM15
(bottom strand) genes upstream of their transcription start
sites. Only identical sequences in the alignment are shown.

Figure 5 shows DNA sequences of plant DMC1 homologues.
Figure 5(A) shows partial genomic DNA sequence of the barley
DMC1 homologue (HvDMC1). The putative promoter region is
located upstream of the translation start site (position
1599). The genomic DNA region amplified by the primers MEI1 and
MEI4 is located between positions 4001 and 4707.
Figure 5(B) shows partial genomic DNA sequence of the tomato
DMC1 homologue (LeDMC1). The genomic DNA region amplified by
the primers MEI1 and MEI4 is located between positions
466 and 1240.

Figure 6 shows a schematic presentation of the T-DNA
regions of constructs SLJ11332 (A) and SLJ112315 (B).

Figure 7 shows an alignment of the amino acid sequences
of the AtDMC1, LIM15 and DMC1 proteins. Identical and
conserved amino acid residues are boxed in black and grey,
respectively. Gaps in the alignment are shown by dots. The
positions and orientations of primers are shown by arrows.
The vertical open arrowheads indicate the positions of
introns in the AtDMC1 gene. The A and B motifs of the
consensus sequence for the purine nucleotide binding site,
detected by visual inspection, are underlined.

Figure 8 shows the restriction map of the promoter
region of the ArLIM15 gene. The positions of the first five
exons are shown by open boxes. The insertion of the
transposon-like element, Limpet1, flanked by two 9 bp direct
repeats, is also shown.



Detailed description and exemplification of the invention 
Materials and Methods 
Plant material

The plants used in this study were Arabidopsis
thaliana Columbia and Landsberg erecta ecotypes. Plants
were grown in the greenhouse at 25°C under 16 hrs of
illumination and automatic watering conditions.


Plant transformation 

Plant transformation of Arabidopsis thaliana (ecotype
Columbia) was performed as described (Bechtold et al.,
1993). Seeds were harvested three weeks after the
vacuum-infiltration, sterilised and screened for transformants on
GM + 1% glucose medium (Valvekens et al., 1988) containing
50 mg/l kanamycin.

DNA isolation and DNA gel blot analysis 

Genomic and cosmid DNA isolation from the plant tissue
and E. coli respectively was performed as described
(Klimyuk et al., 1995; Sambrook et al., 1989). The DNA was
digested with restriction enzymes and DNA fragments were
separated on 1% agarose gel, transferred to Hybond-N
membranes (Sambrook et al., 1989), immobilised on the
membranes by UV crosslinking (UV Stratagene Stratalinker
2400) and subsequently baked on the membranes for 1 hour at
80°C. The hybridisation procedure was performed as
described by Church and Gilbert (1984). DNA fragments for
use as probes were gel-purified and were labelled using a
commercially available oligolabelling kit (Pharmacia).


Subcloning and DNA Sequencing 

All subclonings and template preparations were done
using the phagemid BlueScript (KS+) vector (Stratagene).
The series of unidirectional 250-300 bp deletions were
carried out for large inserts using the Erase-a-Base system
(Promega). The sequencing reactions were performed by using the
DyeDeoxy Terminator cycle sequencing kit (Applied
Biosystems). In some cases the PCR products were sequenced
directly. Sequence analysis was carried out on an ABI 373A DNA
sequencer (Applied Biosystems). The DNA contig carrying the AtDMC1
gene sequence was built up by using the Autoassembler
programme (Applied Biosystems).

Construction of pAtDMC1:GUS fusion

The NcoI site was introduced by site-directed
mutagenesis (Kunkel, 1985) at the position of the first ATG
codon (shown in bold in Figure 3) in the second exon of the 5.9
kb EcoRV subclone of the AtDMC1 gene to make plasmid
SLJ7731. The sequence of the primer designated MUTA used
for mutagenesis is: 5'-TAG AGC TGA AGALACTOSYL IGG AAC GAG
CCC CAT GGA GCT CGT TGA GCG TGA-3'. The NcoI site is shown
in bold. The construct SLJ7731 was digested with NcoI
and PstI restriction enzymes, gel purified and ligated with the
small NcoI-PstI fragment of SLJ4D4. The resulting plasmid
SLJ7744, carrying the pAtDMC1:GUS 3'ocs fusion in the pBS(KS+)
vector, was digested with HpaI and SmaI restriction
enzymes. The large fragment released by this digest was
gel purified and subcloned into the HpaI site of binary vector
SLJ491, based on pRK290. The final construct SLJ7753
(Figure 2B) was mobilised into Agrobacterium tumefaciens
C58C1 strain harbouring the disarmed Ti plasmid pGV2260
(Deblaere et al., 1985) and used in transformation
experiments.

Histochemical GUS assay

Transformed plants carrying T-DNA of SLJ7753 were
X-Gluc stained at the different stages of development as
previously described (Klimyuk et al., 1995). Squashes of
X-Gluc stained, pigment-washed flower buds were prepared as
described below. Specimens were placed in eppendorf tubes
and vacuum-infiltrated with immersion oil in a SpeedVac
concentrator SVC100H (Savant). Then buds with remains of
immersion oil were placed on a slide and gently squashed with a
cover slip. The squashes were examined with a Zeiss
Axiophot microscope.

The histochemical localisation of GUS activity in
flower buds was performed on 10 µm cross-sections of
Historesin-embedded plant material as previously
described (Dolan et al., 1994). For better visualisation of
the tissue structure some of the cross-sections were
stained for 1 min with 0.01% Safranin. The sections were
examined with a Nikon Microphot-SA microscope equipped with
a dark field condenser.

The following criteria were used to identify the
stages of flower development (Bowman, 1994): stage 9 -
petal primordia stalked at base, pistil length (from the
top of the stigma to the point of attachment to the
receptacle) is 0.15 - 0.4 mm; stage 10 - stamen filaments
begin to elongate, pistil length is 0.4 - 0.5 mm; stage 11
- stigmatic papillae appear, pistil length is 0.5 - 1.5 mm.
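
The pistil-length part of these staging criteria can be expressed as a small lookup. This is a sketch only: it ignores the morphological criteria (petal primordia, stamen filaments, stigmatic papillae), and because the stated ranges share their end points, it assumes that a boundary value (0.4 or 0.5 mm) is assigned to the later stage. The function name `flower_stage` is hypothetical.

```python
from typing import Optional


def flower_stage(pistil_length_mm: float) -> Optional[int]:
    """Map pistil length (mm) to flower stage 9, 10 or 11, or None if out of range.

    Assumption: shared boundary values are assigned to the later stage.
    """
    if 0.15 <= pistil_length_mm < 0.4:
        return 9
    if 0.4 <= pistil_length_mm < 0.5:
        return 10
    if 0.5 <= pistil_length_mm <= 1.5:
        return 11
    return None  # outside the ranges given for stages 9 to 11


for length in (0.2, 0.45, 1.0, 2.0):
    print(length, flower_stage(length))
```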

RT-PCR and identification of cDNA ends (RACE) 

RT-PCR analysis of AtDMC1 expression was performed
with MEI1U (5'-GGA GGG AAT GGA AAA GTG-3') and MEI4U
(5'-GGA ACG TTG AAC TCC TCT GCA AT-3') primers using
1 µg of total RNA from 12-day-old seedlings, leaves and
floral buds as a template. The positions and orientations
of the primers coincide with those of their degenerate
homologues, MEI1 and MEI4 (Figure 2A). As an internal
control, primers for Antirrhinum majus polyubiquitin mRNA
(GenBank accession number X67957), kindly provided by Dr M.
O'Dell, were used: 1392 (5'-CAG ATC TTT GTG AAG ACT CTG-3')
and 1393 (5'-GGA CTC CTT CTG GAT GTT GTA-3'). Primer 1393,
directed in antisense orientation, was used for cDNA strand
synthesis.

RNA in situ hybridization 

Digoxigenin-labeling of RNA probes, tissue preparation and
in situ hybridization were performed as described by
Bradley et al. (1993) and Coen et al. (1990).

X-ray treatment of Arabidopsis plants 

Twelve-day-old Arabidopsis plants grown in small (5 cm in
diameter) plastic petri dishes were exposed to 5 and 10
krad of ionizing irradiation. An ABB 6 MV linear
accelerator served as the source of radiation. One hour
after irradiation plants were used for X-Gluc staining and
RNA isolation. Non-irradiated plants served as a control.

RNA gel blot analysis

RNA samples were separated on 1.4% agarose-formaldehyde
gels as described by Ausubel et al. (1987). The agarose gel
was rinsed in several changes of sterile DEPC-treated
distilled water in order to remove formaldehyde and blotted
overnight in 10x SSC to Hybond-N membrane. The membrane was
carefully rinsed with deionized sterile water, and RNA was
immobilized on the membrane and hybridized with probe as
described above for DNA gel blot analysis.

EXAMPLE 1 

Isolation of the AtDMC1 gene

In order to isolate the Arabidopsis thaliana DMC1
homologue, five different degenerate primers were designed
corresponding to amino acid motifs conserved in LIM15 and
DMC1 proteins: MEI1 - 5' GG(N) AA(GA) GT(N) GC(N) TA(CT)
AT(ACT) GA 3'; MEI2 - 5' GA(CT) AC(N) GA(GA) GG(N) AC(N)
TT(CT) (CA)G(N) CC 3'; MEI3 - 5' A(GA)(TC) TT(TC) TG(TC)
TG(N) C(GT)(TC) TC 3'; MEI4 - 5' AC(N) GC(N) AC(GA) TT(GA)
AA(CT) TC(CT) TC(N) GC 3'; MEI5 - 5' GC(GA) TG(N) GC(N)
A(GA)N AC(GA) TG(N) CC(N) CC 3'. The positions of primers
and their orientations are shown in Figure 7. Different
combinations of primers were used for PCRs with Arabidopsis
genomic DNA as a template. The PCR reactions were performed
with 0.05 µg of A. thaliana (Landsberg erecta) DNA as a
template in a volume of 50 µl in the presence of 2 µM of
each of the two selected primers in a buffer containing 250
µM dNTPs (Pharmacia), 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 2.5
mM MgCl2, 0.05% Nonidet P-40, and 2.5 units of "AmpliTaq"
thermostable DNA polymerase (Perkin Elmer Cetus). Cycling
conditions were: 94°C for 15 sec; 50°C for 30 sec; 72°C
for 2 min; 35 cycles, followed by a 10 min extension at
72°C. The largest PCR products were reamplified with nested
primers. Only two sets of primers, MEI1 - MEI5 and MEI1 -
MEI4, proved effective. Primers MEI2 and MEI3 did not
amplify the expected size class either together or in
combination with other primers. Subsequent analysis showed
that they anneal to parts of the cDNA sequence that are
interrupted by introns in genomic DNA (Figure 7). The major
679 bp PCR band, obtained after the reamplification with
MEI1 and MEI4 primers of the first round PCR product (MEI1
and MEI5 primers), was blunt-ended by treatment with T4
polymerase, subcloned into the EcoRV site of pBS(KS+) and
sequenced. The fragment, designated as probe A (Figure 2A),
encoded a 111 amino acid sequence with 93% identity to the
homologous part of the LIM15 protein (Kobayashi et al.,
1993).
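
To give a sense of how many distinct oligonucleotides each degenerate primer represents, the degeneracy can be computed from the bracket notation used above. This is a minimal sketch, assuming that (N) or a bare N denotes any of the four bases and that a bracketed group such as (GA) denotes a choice among the listed bases; the function name `degeneracy` is hypothetical.

```python
import re


def degeneracy(primer: str) -> int:
    """Count the distinct sequences encoded by a degenerate primer.

    Bracketed groups such as (GA) are alternatives at one position;
    N, bracketed or bare, is taken to mean any of A, C, G or T.
    """
    seq = primer.replace(" ", "")
    total = 1
    for group, single in re.findall(r"\(([A-Z]+)\)|([A-Z])", seq):
        if group:
            total *= 4 if group == "N" else len(set(group))
        elif single == "N":
            total *= 4
    return total


MEI1 = "GG(N) AA(GA) GT(N) GC(N) TA(CT) AT(ACT) GA"
MEI4 = "AC(N) GC(N) AC(GA) TT(GA) AA(CT) TC(CT) TC(N) GC"
print(degeneracy(MEI1))  # 4*2*4*4*2*3 = 768
print(degeneracy(MEI4))  # 4*4*2*2*2*2*4 = 1024
```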

Two unique primers, MEI1U (5' GGA GGG AAT GGA AAA GTG
3') and MEI4U (5' GGA ACG TTG AAC TCC TCT GCA AT 3'), that
annealed to the same location as their degenerate
homologues (Figure 2A), were synthesised and used for the
PCR screening of DNA pools prepared from 68 plates (384
clones per plate) of an Arabidopsis cosmid library
representing four genomic equivalents. The library was made on the
basis of the CLD04541 binary vector and was kindly provided by
C. Lister and C. Dean (Cambridge Laboratory, JIC, Norwich).
The probing of filter replicas from positive plates with
probe A recovered five cosmid clones. One of the clones,
called 64/23/C, was used for the restriction mapping and
subcloning of 5.9 kb EcoRV and 3.6 kb ClaI overlapping
fragments into the pBS(KS+) vector. Both strands of the two
overlapping fragments, encompassing 8 kb of cosmid insert,
were completely sequenced.

The DNA sequencing and sequence analysis were performed
as described above. Database searches with the BlastX
program (Altschul et al., 1990) revealed a gene within this
region, designated AtDMC1, whose highest homology was to
the ArLIM15 protein followed by LIM15 and its human and
mouse homologues as well as several RAD51 homologues and
other RecA-like proteins, including yeast DMC1, from
different eukaryotic organisms. The GenBank accession
number for the AtDMC1 gene is U76670. DNA gel blot
analysis of AtDMC1 revealed that this is a single copy
gene.






Molecular mapping 

The labelled 5.9 kb EcoRV fragment of the AtDMC1 genomic clone
(designated as probe B; Figure 2A) was initially used to
probe a Southern blot of Arabidopsis thaliana (ecotypes
Landsberg erecta and Columbia) DNA to detect specific
RFLPs. AtDMC1 was shown to be polymorphic with EcoRI,
EcoRV, XbaI, HindIII, HpaI and BstEI enzymes between A.
thaliana Landsberg erecta and Columbia ecotypes.
Subsequently, Southern blots of EcoRI digested DNAs
isolated from 41 recombinant inbred (RI) lines between the
ecotypes Landsberg erecta and Columbia (Lister and Dean,
1993) were hybridised with probe B. The molecular mapping
of the AtDMC1 gene was carried out by using the program
MAPMAKER V.1.0 (Lander et al., 1987) and data for the
segregation of 92 single-copy sequences covering the five
Arabidopsis chromosomes.

The RFLP for AtDMC1 was mapped to the top arm of
chromosome 3 between the m560B2 and g4711 molecular markers,
at distances of 4.8 cM and 7.7 cM from the respective
markers. In addition, mapping of AtDMC1 using
recombinant inbred lines of Landsberg erecta and Columbia
ecotypes led to the identification of a single locus, and
the conclusion that AtDMC1 and ArLIM15 are allelic.



Characterisation of the AtDMC1 transcript by RT-PCR

Screening of an Arabidopsis inflorescence cDNA
library made using the ZAPII vector (distributed by J.
Dangl) with probe A did not reveal any positive clones,
probably due to the low abundance or transitory presence of
the AtDMC1 transcripts. Comparison of conceptual
translations of the AtDMC1 gene to the LIM15 cDNA sequence
showed a very high degree of amino acid sequence homology
(more than 80% identity). On the basis of this homology 14
exons and 13 introns were predicted within the AtDMC1 gene.
In order to confirm the prediction and to define precisely
the boundaries of exons, RT-PCR, 5' RACE and 3' RACE were
carried out as described (Frohman et al., 1988; Frohman,
1989) using total RNA from floral buds as a template.

Total RNA was prepared from Arabidopsis 0.1 - 1 mm
floral buds and young leaves by a standard method
(Harpster et al., 1988) or by the method described by Napoli et
al. (1990). Both methods produced good quality RNA, but
the latter was preferred when small quantities of total
RNA were required.

For RT-PCR, complementary DNA synthesis was carried out
with 2 µg of total RNA and primer MEI6U (5' ATC CTT CGC GTC
AGC AAT GCC 3'). Second strand synthesis and PCR were
carried out with primers 5RN (5' ATG CAG CTC GTT GAG CGT
GAA 3') and MEI6U. The PCR mixture was heated for 5 min at
95°C, followed by 35 cycles of amplification (94°C for 40 sec;
56°C for 1 min; 72°C for 2 min) and a 10 min final extension at
72°C.

According to the prediction, the primers should
amplify 996 bp and 2586 bp fragments from the cDNA and
genomic DNA respectively. The two expected bands were observed
after the electrophoretic separation of RT-PCR products and
the 996 bp fragment was cloned and sequenced. The sequence
confirmed the prediction, except that minor corrections
were necessary to the predicted boundaries of some exons.

Further steps were undertaken to identify the full
length transcript, in particular the transcription start
site. The GENEFINDER program (developed by S. Klostermann,
Max-Planck Institute, Martinsried) predicted an additional
short exon 5' to the first of the 14 exons that were
initially identified.

For identification of the transcription start site, 5'
RACE complementary DNA synthesis was performed with 5 µg of
total RNA and primer 5R2 (5' TCA GCA GCT TCA CAG ATT TTG
3'). Second strand cDNA synthesis and first PCR
amplification were done with QI, QT (Frohman, 1989) and
5R2N (5' TCA ACT TTG GCC TCA GAT AAA C 3') primers.
Reamplification was done with QI and 5R2N1 (5' TTC TTG GTA
TGC ATC ATG AGALACTOSYL IGG 3') primers. Several 5' RACE
products were size-selected by gel electrophoresis,
subcloned and sequenced as described above for genomic DNA.
The beginning of the longest was presumed to correspond to
the transcriptional start site, designated +1 in Figure 3.
This result confirmed the existence of the additional exon and
allowed us to define the promoter of the AtDMC1 gene.

The promoter contains some interesting motifs upstream
of the putative TATA box. Four direct repeats were found at
positions -285 to -397 (Figure 3). Two nearly complete
repeats, 9 bp and 11 bp, are flanked by two complete 15 bp
repeats. Interestingly, three of the repeats contain the
short palindromic repeat ATGCAT at their 3' ends. A Transfac
v 3.2 database search for homology with known transcription
factor binding sites using TESS - Transcription Element
Search Software (http://agave.humgen.upenn.edu/tess/index.html)
revealed that these repeats contain putative
transcription factor binding domains. The repeats contain
6 - 11 bp long sequences that show homology to the
transcription factor binding sites for quail transcription
factor EFII (Sealey & Chalkley, 1987), human glucocorticoid
receptor (Haerd et al., 1990) and Xenopus octamer-binding
factor (Tebb & Mattaj, 1989).
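
That ATGCAT is palindromic in the molecular-biology sense (identical to its own reverse complement) can be checked directly. The helper names below are hypothetical and the snippet is only an illustration of the definition.

```python
# Watson-Crick complement table for upper-case DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome reads the same on both strands in the 5' to 3' direction."""
    return seq == reverse_complement(seq)


print(reverse_complement("ATGCAT"), is_palindromic("ATGCAT"))
```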

For 3' RACE, complementary DNA synthesis was carried
out with 5 µg of total RNA and QT primer (Frohman, 1989).
Second-strand cDNA synthesis and first PCR were done with
QO and GSP1 (5' TCT GGG AAA ACC CAA TAA 3') primers.
Reamplification was done with QI and GSP2 (5' GCA CAT ACC
CTT TGT GTC 3') primers. Results of 3' RACE revealed a 260
bp untranslated region (excluding the polyA tail). The full
length mRNA transcript sequence was inferred from compiling
the RT-PCR and RACE data.

This transcript encodes a putative AtDMC1 protein of 344 amino
acids, which exhibits significant sequence similarity to the
lily LIM15 (Kobayashi et al., 1993) and yeast DMC1 (Bishop
et al., 1992) meiotic proteins. The optimal alignment
showed 84.3% amino acid identity and 93.6% amino acid
similarity between AtDMC1 and LIM15, and 51.8% amino acid
identity and 71.1% similarity between AtDMC1 and DMC1.
AtDMC1 exhibits a somewhat lower level of homology with
yeast RAD51, a protein required for mitotic and meiotic
recombination and resistance to ionising radiation (48.5%
and 70.6% identity and similarity respectively). AtDMC1,
like other RecA-like DNA strand-exchange proteins
(Kowalczykowski and Eggleston, 1994), possesses consensus
ATP binding sites (Walker et al., 1982).
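
For readers unfamiliar with how a percentage identity is derived from a pairwise alignment, the following sketch counts identical residues over aligned columns. It is an illustration under stated assumptions: the source does not say which denominator was used, so columns containing a gap (shown as a dot, as in Figure 7) are simply skipped here, and the two short sequences are invented examples, not AtDMC1 or LIM15 data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = ".") -> float:
    """Percentage of identical residues over the gap-free aligned columns.

    Assumption: columns where either sequence has a gap are excluded.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if gap not in (a, b)]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Invented toy alignment: five gap-free columns, four of them identical.
print(percent_identity("MKLV.A", "MKIVGA"))
```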

Comparison of the AtDMC1 and the ArLIM15 (Sato et al.,
1995) transcripts and genomic sequences confirmed positions
for most of the exon/intron junctions, but the borders of
intron 14 of the ArLIM15 gene appear to have been determined
incorrectly. As a result, two amino acids, alanine (A) and
glutamic acid (E) (positions 326 and 327 of the AtDMC1
protein), were excluded from the predicted protein sequence
of ArLIM15 (Sato et al., 1995). It is unlikely that
splicing for this particular intron is different between
Columbia and Landsberg erecta ecotypes. There is an amino acid
substitution at position 103 (leucine in Columbia,
serine in Landsberg erecta). There is also a difference in
the sizes of the transcripts: the AtDMC1 transcription start site
is 4 bp upstream of that for ArLIM15 and the last exon
is 75 bp longer.

In situ hybridization analysis of AtDMC1 expression

In order to test whether or not the AtDMC1 gene is
expressed at the time of meiosis, in situ hybridisation
analysis using cross-sections of the whole inflorescence
and DIG-labelled antisense AtDMC1 RNA as a probe was
carried out. Digoxigenin-labelling of RNA probes, tissue
preparation and in situ hybridisation were performed as
described by Bradley et al. (1993) and Coen et al. (1990).

The expression of the AtDMC1 gene in whorl 3 is restricted
to pollen mother cells. No expression was detected in the
tapetum. In whorl 4 the hybridisation signal was restricted to
megaspore mother cells of ovules. DIG-labelled sense AtDMC1
RNA was used as a negative control and did not reveal any
hybridisation signal at the stages of flower development at
which the expression of AtDMC1 might take place. A weak
nonspecific signal was detected in the mature pollen grains
following hybridization with sense and antisense AtDMC1
RNAs (data not shown). No signals were detected in
postmeiotic ovules and developing embryos or in the
analysed vegetative tissues.


Characterization of expression of a pAtDMC1:GUS fusion

The meiosis-associated expression of AtDMC1 led us
to investigate whether the AtDMC1 promoter could direct
meiosis-associated expression of the GUS reporter gene. A
translational fusion was made between the putative AtDMC1
promoter and coding sequences of the GUS reporter gene
(see: Construction of pAtDMC1:GUS fusion in Materials and
Methods).

The fusion consists of a 3.3 kb DNA fragment
containing the AtDMC1 promoter fused in frame with GUS at
the position of the methionine residue located in the
second exon (Figures 2B and 3). The sequence of the 3.3 kb
DNA fragment of the AtDMC1 gene used to drive
meiosis-specific expression of a GUS reporter gene and its
alignment with the previously published sequence of the ArLIM15
gene are shown in Figure 3. As a result the GUS protein
carries 13 AtDMC1 amino acid residues at its amino
terminus. These residues appear to be neutral with respect
to GUS activity. A schematic representation of the T-DNA
region carrying the pAtDMC1:GUS fusion is shown in Figure
2B.

Eight primary transformants with this pAtDMC1:GUS
fusion were obtained and analysed for the presence of GUS
expression patterns. Three primary transformants did not
reveal any GUS activity, one showed ubiquitous GUS
expression and four transformants revealed GUS expression
patterns which were restricted to whorl 3 and whorl 4
of flowers. No GUS expression was detected in roots, leaves
and stems of these four transformants, except that one of
them exhibited GUS expression in damaged tissues.

There was also X-Gluc staining in the receptacles of
some of the open flowers, but this pattern is very common
for plants carrying the GUS gene and may be considered as
non-specific (Klimyuk et al., 1995).

The GUS expression initially appears in the anthers of
approximately 0.2 mm long inflorescence buds and later in
the carpels of more advanced, approximately 1 mm long,
buds. Meiosis in anthers and carpels does not coincide
(Bowman, 1994). The beginning of meiotic prophase I in
anthers takes place at stage 9 of flower development, while
the meiotic events in carpels do not occur until stage 11
(Bowman, 1994). Cross-sections of X-Gluc-stained buds at
the different stages of development showed that GUS
expression first appeared in pollen mother cells of anthers
from inflorescence buds early in stage 9. In late stage 9
GUS staining increased dramatically. GUS expression in
ovaries was first detected at stage 11 of flower
development.

This temporal and spatial coincidence of reporter gene
expression driven by the AtDMC1 promoter with the stages of
floral bud development corresponding to the time of meiosis
indicates that the AtDMC1 promoter can drive
meiosis-associated GUS expression. The presence of residual GUS
activity adjacent to the main sites of localisation in
anthers and ovaries is the result of artefactual indigo
blue dye formation in surrounding tissue. The specificity
of dye localisation can be improved by using 0.2 - 1 mM
potassium ferrocyanide/ferricyanide in the staining solution
(Jefferson, 1987). However, we found that this compromised the
sensitivity of the protocol, resulting in loss of the weak
GUS expression patterns at the early stages of meiosis.
Inflorescences of nontransformed Arabidopsis plants were
used as a negative control and they did not reveal any
patterns of X-Gluc staining.

AtDMC1 and ArLIM15 genes are different within the promoter
regions

Alignment of the AtDMC1 and ArLIM15 (Sato et al.,
1995) gene sequences revealed that these genes, isolated
from ecotypes Landsberg erecta and Columbia respectively,
are virtually identical throughout their respective coding
regions, but there are significant differences in the 5'
sequences. The sequences diverge 219 bp upstream of the
transcriptional start site of the AtDMC1 gene.

Different combinations of primers ATM2 (5' GCA ACT GAA
TTT GTT TTC GTT TG 3'), ATM1 (5' TTG ATT AGT GGA TCC GCA
AAC AA 3') and AR2 (5' TAG ATG AAA CGA GTT TGA CAC ATG 3')
were used for PCR amplification of genomic DNA isolated
from Landsberg erecta and Columbia ecotypes of A. thaliana.
The PCR amplification was performed as described above
for isolation of genomic clones, except that each primer
concentration was 0.1 µM and cycling conditions were: 94°C
for 20 sec; 58°C for 20 sec; 72°C for 2 min; 35 cycles,
followed by a 10 min extension at 72°C. Primers T1 (5' GGG
AAT GTT CCA ATA TAA G 3') and T2 (5' GAG AAT ATT ACA CTC
TTA AA 3') were used for amplification of Limpet1
sequences. The set of primers AR2 - ATM2 produced the
expected 446 bp PCR band for Columbia ecotype, but nothing
for Landsberg erecta, whereas primers ATM1 - ATM2 produced
the expected 342 bp PCR product for Landsberg erecta and
a 2.2 kb PCR product from Columbia ecotype. The positions and
orientations of primers are shown in Figure 8.
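
As a rough cross-check on the ATM2, ATM1 and AR2 primers given above, the following sketch tabulates their length, GC content and a rule-of-thumb melting temperature. The Wallace rule (2°C per A/T, 4°C per G/C) is an assumption introduced here for illustration only; the text does not say how the 58°C annealing step was chosen. All function and variable names are hypothetical.

```python
# Primer sequences copied from the text, written 5' to 3' with spaces removed.
PRIMERS = {
    "ATM2": "GCAACTGAATTTGTTTTCGTTTG",
    "ATM1": "TTGATTAGTGGATCCGCAAACAA",
    "AR2": "TAGATGAAACGAGTTTGACACATG",
}


def gc_count(seq):
    """Number of G and C bases in the primer."""
    return sum(seq.count(base) for base in "GC")


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C (assumption, see lead-in)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def summarise(primers):
    """Return length, GC percentage and estimated Tm for each primer."""
    summary = {}
    for name, seq in primers.items():
        summary[name] = {
            "length": len(seq),
            "gc_percent": round(100 * gc_count(seq) / len(seq), 1),
            "tm_estimate_c": wallace_tm(seq),
        }
    return summary


if __name__ == "__main__":
    for name, row in summarise(PRIMERS).items():
        print(name, row)
```

Under this rule of thumb all three estimates fall a few degrees above the 58°C annealing temperature quoted in the text, which is the usual relationship between an estimated melting temperature and an annealing step.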

This result confirmed that the AtDMC1 and ArLIM15
sequences described above are not derived from chimeric
clones and reflect genuine differences in gene structure.
To determine the nature of the rearrangement, the sequence of
the ArLIM15 promoter region was extended. This indicated
that the difference between the AtDMC1 and ArLIM15 promoter
regions was caused by a 1874 bp DNA fragment present in the
ArLIM15 gene but absent in AtDMC1 (Figure 8). The
fragment is flanked by two 9 bp direct repeats and has 26
bp imperfect inverted repeats, with 73% identity, at its
termini; internal to these are two shorter inverted repeats
(GenBank accession number U76697). The general structure of
the fragment suggests that it is a transposable element and
that it is probably a member of the class of transposons
that exhibit DNA-mediated transposition. The two 9 bp
direct repeats at the ends of the element probably
represent a target site duplication. Consistent with this
suggestion, such a duplication is absent within the AtDMC1
promoter.
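
The figures quoted above can be checked against one another with simple arithmetic. The sketch below adds the 1874 bp fragment to the 342 bp empty-site product and compares the sum with the roughly 2.2 kb product reported for Columbia. Whether the 1874 bp figure already includes one copy of the 9 bp direct repeat is not stated, so both cases are computed; the function names are hypothetical.

```python
EMPTY_SITE_PRODUCT_BP = 342  # ATM1 - ATM2 product from Landsberg erecta
ELEMENT_BP = 1874            # fragment present in ArLIM15 but absent in AtDMC1
TSD_BP = 9                   # length of each flanking direct repeat
TIR_BP = 26                  # length of the terminal inverted repeats
TIR_IDENTITY = 0.73          # reported identity between the inverted repeats


def occupied_site_product(empty_bp, element_bp, tsd_bp, element_includes_tsd):
    """Expected PCR product size when the element occupies the site.

    An insertion adds the element plus one extra copy of the duplicated
    target site, unless that copy is already counted in the element length.
    """
    extra = 0 if element_includes_tsd else tsd_bp
    return empty_bp + element_bp + extra


def matching_tir_positions(tir_bp, identity):
    """Approximate number of matching positions in the inverted repeats."""
    return round(tir_bp * identity)


if __name__ == "__main__":
    for includes in (False, True):
        size = occupied_site_product(
            EMPTY_SITE_PRODUCT_BP, ELEMENT_BP, TSD_BP, includes
        )
        print(f"element_includes_tsd={includes}: {size} bp")
    print("matching TIR positions:", matching_tir_positions(TIR_BP, TIR_IDENTITY))
```

Either assumption gives a product of about 2.2 kb, in line with the size observed for the Columbia ecotype.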

The element, which we have designated Limpet1 (LIM
promoter entrenched transposon-like element), was used as
a probe on blots of genomic DNA from three different
ecotypes of Arabidopsis thaliana. Primers T1 and T2 were
used to produce the Limpet1 hybridisation probe by
reamplification of the ATM1 - ATM2 PCR product from ecotype
Columbia. DNA gel blot analysis indicated that the Columbia
ecotype contains Limpet1 and at least two additional
closely related elements, but no Limpet1 or related
sequences were found in Landsberg erecta.

In order to assess whether or not the insertion of
Limpet1 affects the expression pattern of ArLIM15, relative
to the pattern exhibited by AtDMC1, RT-PCR was performed
with MEI1U and MEI4U primers, using equal amounts of total
RNA isolated from floral buds and leaves of both ecotypes.
The results of DNA gel blot analysis of RT-PCR products
suggest that the two ecotypes exhibit very similar and
relatively high levels of expression in inflorescence
tissue and similar but much lower levels of expression in
the leaves.

Isolation of barley and tomato DMC1 homologues 

Degenerate primers MEI1, MEI4 and MEI5, which proved
to be efficient for isolation of the AtDMC1 gene, were used to
generate gene-specific probes for barley and tomato DMC1
homologues, designated HvDMC1 and LeDMC1, respectively.
The first amplification was performed with MEI1 - MEI5
primers under the conditions described above for the isolation
of the AtDMC1 gene-specific probe. The first PCR product
was diluted 10 times with sterile distilled water and 1 µl
of it was used as a template for re-amplification with the MEI1
- MEI4 set of primers. Agarose gel electrophoresis of PCR
products showed a single major DNA band approximately 700
and 800 bp long for barley and tomato, respectively.
Gel-purified DNA bands were blunt-ended, cloned into the EcoRV
site of pBS(KS+) and sequenced. Comparative analysis of the
clones revealed that they encode amino acid sequences with
more than 90% identity to the homologous parts of the LIM15
and AtDMC1 proteins. These clones, encoding part of the
LeDMC1 and HvDMC1 genes, were used to screen the tomato
cosmid (Dixon et al., 1996) and barley lambda (Stratagene)
genomic libraries, respectively. Partial sequences of
genomic clones recovered from the screens are shown in
Figure 5. A database similarity search using the BlastX program
(Altschul et al., 1990) helped to identify the translation
start codon of the HvDMC1 gene. Only the HvDMC1 genomic clone was
sequenced far enough upstream of the coding region to
provide at least 1.5 kb of the putative promoter sequences
(Figure 5a).

Success in the isolation of DMC1 homologues from
Arabidopsis, barley and tomato using the set of MEI1, MEI4
and MEI5 degenerate primers is a convincing example that
these primers can be used to isolate DMC1 homologues from
monocot and dicot plant species. Comparative analysis of
the coding regions of plant DMC1 homologues, including the
soybean DMC1 homologue cDNA (Database accession number
U66836, direct submission 13 August 1996) and the recently
isolated partial sequence of the rice LIM15-like gene
(Database accession number U85613, direct submission 16
January 1997), demonstrated that they are at least 70%
identical. The most variable part of the coding sequences
is located at the 5' end of the cDNA and encompasses
approximately 80 bp from the translation start site. It
also revealed that plant DMC1 homologues have a conserved
exon/intron structure, which explains the success of
using MEI1, MEI2 and MEI4 primers in different species.
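
To give a feel for how degenerate these primers are, the sketch below parses the bracketed notation used for MEI1, MEI4 and MEI5 in the claims and counts how many distinct oligonucleotides each mixture represents. It assumes that a bracketed group such as (GA) means "any one of these bases" and that N means any of A, C, G or T; the parser and its names are hypothetical and are not part of the described method.

```python
import re
from math import prod

ANY_BASE = "ACGT"  # assumed meaning of N

# Degenerate primers as written in the claims (5' to 3').
DEGENERATE_PRIMERS = {
    "MEI1": "GG(N) AA(GA) GT(N) GC(N) TA(CT) AT(ACT) GA",
    "MEI4": "AC(N) GC(N) AC(GA) TT(GA) AA(CT) TC(CT) TC(N) GC",
    "MEI5": "GC(GA) TG(N) GC(N) A(GA)N AC(GA) TG(N) CC(N) CC",
}


def parse_primer(notation):
    """Return one set of allowed bases per primer position."""
    compact = notation.replace(" ", "")
    positions = []
    # Each match is either a bracketed group or a single bare letter.
    for group, single in re.findall(r"\(([A-Z]+)\)|([A-Z])", compact):
        allowed = set()
        for symbol in group or single:
            allowed.update(ANY_BASE if symbol == "N" else symbol)
        positions.append(allowed)
    return positions


def degeneracy(notation):
    """Number of distinct sequences encoded by the degenerate primer."""
    return prod(len(allowed) for allowed in parse_primer(notation))


if __name__ == "__main__":
    for name, notation in DEGENERATE_PRIMERS.items():
        print(name, len(parse_primer(notation)), "nt,", degeneracy(notation), "variants")
```

Every degenerate position in these primers falls on the third base of a triplet, which is consistent with primers designed from conserved amino acid sequence.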

EXAMPLE II 

Construction of pAtDMC1:barnase fusion for meiotic cell
ablation

The plasmid SLJ7744, carrying the pAtDMC1:GUS 3'ocs
fusion in the pBS(KS+) vector, was digested with NcoI and
EcoRV; the large fragment was gel-purified and ligated with
the NcoI - EcoRV fragment of pJB142 carrying the
barnase-barstar-CaMV polyA fusion. The final construct was
digested with XbaI and the 6 kb gel-purified fragment,
carrying the pAtDMC1:barnase-barstar-CaMV polyA signal,
was cloned into the XbaI site
of the binary vector SLJ755I5. The final construct
SLJ11332 (Figure 6a) was mobilised into Agrobacterium
tumefaciens C58C1 strain and used in transformation
experiments.

Construction of vector carrying AtDMC1 cDNA fused to
potato virus X (PVX) amplicon

The first 450 bp of the AtDMC1 cDNA were amplified
with ClaI/1 (5'-CAAAATTCTATCGATCTCACTCTTCCAAGCTTA-3')
and ClaI/2 (5'-CAAAAGCCTCTGTGATCGATGAGGTTTCAATTCCACC-3')
primers with introduced ClaI sites (shown in bold).
The PCR fragment was digested with ClaI, gel-purified and
cloned into the ClaI site of the binary vector pVDH401
carrying the PVX amplicon (Angell & Baulcombe, 1997). The
final construct SLJ112315 (Figure 6b) was mobilised into
Agrobacterium strain and used in transformation
experiments as described in Materials and Methods.
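
Because the bold type marking the introduced sites does not survive in plain text, the short sketch below locates the ClaI recognition sequence (ATCGAT) in each of the two primers. The helper name is hypothetical; positions are zero-based offsets from the 5' end of each primer.

```python
CLAI_SITE = "ATCGAT"  # ClaI recognition sequence (palindromic)

# Primer sequences copied from the text, 5' to 3'.
CLONING_PRIMERS = {
    "ClaI/1": "CAAAATTCTATCGATCTCACTCTTCCAAGCTTA",
    "ClaI/2": "CAAAAGCCTCTGTGATCGATGAGGTTTCAATTCCACC",
}


def site_positions(seq, site):
    """Return every zero-based offset at which the site occurs in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for name, seq in CLONING_PRIMERS.items():
        for pos in site_positions(seq, CLAI_SITE):
            # Upper-case the site and lower-case the rest for display.
            shown = (
                seq[:pos].lower()
                + seq[pos:pos + len(CLAI_SITE)]
                + seq[pos + len(CLAI_SITE):].lower()
            )
            print(name, pos, shown)
```

Each primer carries exactly one ClaI site, preceded by several 5' bases, so that digestion of the PCR product leaves a fragment with ClaI-compatible ends at both termini.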

It was shown that amplicon-mediated gene silencing
can be used as an efficient tool to suppress endogenous
RNA sharing homology with the transgene (Angell &
Baulcombe, 1997). Considering that the yeast dmc1 mutant
fails to form a normal synaptonemal complex and, as a
result, produces a very low percentage of viable spores
(Bishop et al., 1992), amplicon-mediated silencing of
AtDMC1 expression can be an alternative way to switch off
the mechanism of sexual reproduction in Arabidopsis.
Coding sequences of the Arabidopsis, tomato, barley and rice
(Database accession number U85613, direct submission 16
January 1997) homologues share 70 - 80% homology.
Considering this, it is possible that amplicon-based
AtDMC1 cDNA can efficiently silence expression of DMC1
homologues in other plant species. Alternatively, the host
DMC1 homologue cDNA can be used in the amplicon to switch off
the mechanism of sexual reproduction and to achieve
apomictic seed production.

The remarkable specificity of the AtDMC1 promoter, which
confers tight developmental regulation of reporter gene
expression in whorl 3 and whorl 4, may serve as a model
system to study the mechanism of such regulation. Four
direct repeats identified upstream of the AtDMC1 TATA box
(Figure 3) may play an important role in developmental
regulation of AtDMC1 gene expression. Functional
dissection of the promoter and site-directed mutagenesis
may help to identify cis-regulatory sequences controlling
the transcription of the AtDMC1 gene and to clarify the
possible involvement of the direct repeats in such
regulation. Such sequences may be used as "baits" in a
one-hybrid system.

Identification of the minimal region of the AtDMC1
promoter possessing all the sequences necessary to drive
meiosis-specific transcription can be achieved by
preparing constructs carrying a truncated AtDMC1 promoter
fused to the GUS reporter gene, transforming them into
Arabidopsis and testing them for the ability to confer
meiosis-specific GUS expression. These experiments will
produce a truncated meiosis-specific promoter or even
meiosis-specific enhancer sequences which, being shorter
than the present 3 kb long version, will be more
suitable for making constructs. Meiosis-specific enhancers may be
fused to any heterologous "minimal" promoter, for
example the -67 bp 35S promoter, which is not able to
drive transcription by itself but contains an RNA polymerase
binding site.
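
The truncation experiment described above can be planned as a 5' deletion series in which each construct keeps the same 3' end (the junction with GUS) and loses progressively more upstream sequence. The sketch below enumerates such a series. The 500 bp step, the 500 bp minimum and the placeholder sequence are assumptions chosen for illustration, not values from this specification, and the function name is hypothetical.

```python
def five_prime_deletions(promoter, step_bp, minimum_bp):
    """List (bp_retained, fragment) pairs for a 5' deletion series.

    Every fragment ends at the 3' end of the input sequence, so the
    junction with the reporter gene is the same in all constructs.
    """
    if step_bp <= 0:
        raise ValueError("step_bp must be positive")
    series = []
    retained = len(promoter)
    while retained >= minimum_bp:
        series.append((retained, promoter[len(promoter) - retained:]))
        retained -= step_bp
    return series


if __name__ == "__main__":
    # Placeholder standing in for the ~3 kb promoter; not real sequence data.
    placeholder_promoter = "ACGT" * 750
    for retained, fragment in five_prime_deletions(placeholder_promoter, 500, 500):
        print(f"-{retained} bp construct, starts with {fragment[:8]}...")
```

Each fragment in the list would be fused to the GUS reporter and scored for meiosis-specific staining; the shortest fragment that still gives the pattern approximates the minimal promoter.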

The present inventors have made constructs carrying
different types of fusion of the AtDMC1 promoter to
different transposase genes (Ac, Spm) as well as to the
Tnt1 tobacco retrotransposon.

In the case of the Tnt1 retrotransposon, part of the 5'LTR
upstream of the retrotransposon TATA box was replaced with
the AtDMC1 promoter sequences located upstream of the
AtDMC1 TATA box. A transcriptional fusion within the
nontranslated leader of the AtDMC1 gene was made for Spm
transposase and pAtDMC1:10ATG. The Ac transposase
translational fusion was performed within the second exon
of the AtDMC1 gene, as described for the pAtDMC1:GUS construct
(Figure 2B).

Isolation of the promoters of DMC1 homologues from
other plant species and testing their abilities as well
as the ability of pAtDMC1 to drive meiosis-specific
transcription in heterologous plant systems can easily be
envisaged.
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Claims 

1. A nucleic acid molecule comprising

(a) the Arabidopsis meiosis-specific DMC1
(AtDMC1) gene promoter; or

(b) a meiosis-specific promoter homologous to
(a) but from another plant species; or

(c) a meiosis-specific promoter of the gene of
a homologous DMC1 protein from another plant species; or

(d) a meiosis-specific promoter variant,
mutant, allele or derivative of (a), (b) or (c); or

(e) a portion of (a), (b), (c) or (d)
sufficient to confer meiosis-specific character to a
promoter containing it.

2. A nucleic acid molecule according to claim 1 wherein
the homologous DMC1 protein is encoded by a nucleic acid
sequence having at least 55% homology with the nucleic
acid sequence of AtDMC1.

3. A promoter comprising at least a portion of a
nucleic acid molecule according to claim 1 or claim 2
sufficient to confer meiosis-specific character to the
promoter.

4. A promoter according to claim 3 comprising all or
part of the sequence shown in Fig. 4(A).

5. A promoter according to claim 4 wherein the sequence
is derived from nucleic acid which lies 5' of nucleotide
4473 in the sequence shown in Fig. 4(A).

6. A nucleic acid construct comprising a
meiosis-specific promoter according to any one of the preceding
claims operably linked to a heterologous nucleic acid
sequence.






7. A nucleic acid construct according to claim 6
wherein the heterologous nucleic acid sequence is a gene.

8. A nucleic acid construct according to claim 7 
wherein the heterologous gene encodes a cytotoxic 
protein. 

9. A nucleic acid vector comprising a promoter or 
nucleic acid construct according to any one of the 
preceding claims. 

10. A recombinant host cell containing a promoter 
according to any one of claims 1 to 5, or a nucleic acid 
construct according to any one of claims 6 to 8 or a 
nucleic acid vector according to claim 9, optionally 
integrated into the genome of the host cell. 

11. A host cell according to claim 10 being a plant 
cell. 

12. A method of producing a plant cell of claim 11 
comprising the steps of introducing a nucleic acid 
comprising a promoter, construct or vector into a plant 
cell; and causing or allowing recombination between the 
nucleic acid and the plant cell genome to introduce the 
sequence of said nucleic acid into the genome. 

13. Use of a promoter, construct or vector according to
any one of claims 1 to 9 in the production of a
transgenic plant.

14. A plant cell comprising in its genome a promoter,
construct or vector according to any one of claims 1 to
9.

15. A plant, part or propagule thereof, seed, selfed or 
hybrid progeny or descendant thereof comprising a plant 
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cell according to claim 16. 

16. A method of inducing meiosis-specific expression of
a nucleic acid sequence in a transgenic plant cell
comprising the steps of

(a) transforming a plant cell with a nucleic
acid comprising a promoter, construct or vector according
to any one of claims 1 to 9 so as to introduce said
nucleic acid into the genome of said plant cell so that
expression of a nucleic acid sequence is regulated by
the meiosis-specific promoter; and optionally

(b) regenerating a plant from said plant cell.

17. A method of producing a sterile plant comprising the
steps of modifying plant cells by incorporating into
their genome nucleic acid comprising a modification
nucleic acid sequence and a promoter, construct or vector
according to any one of claims 1 to 9 so that said
modification nucleic acid sequence is expressed in said
meiotic cells under the control of the meiosis-specific
promoter, thereby altering the meiotic cell cycle and
rendering the plant sterile.

18. A method according to claim 17 wherein the construct
further comprises an insertion sequence flanked by target
sequences and positioned in between said nucleic acid
molecule and said modification nucleic acid sequence,
said target sequences being under the control of an
inducible promoter such that on induction the insertion
sequence is eliminated.

19. A method according to claim 17 or 18 wherein the
target sequences are lox recombination sites which are
activated by cre recombinase under the control of the
inducible promoter.

20. A method according to any one of claims 17 to 19
wherein the modification nucleic acid sequence encodes a
cytotoxic protein which leads to the ablation of meiotic
cells.



21. A method of isolating DMC1 homologues comprising
the steps of PCR on template nucleic acid using any one
or more of the following degenerate primers

MEI1 - 5' GG(N) AA(GA) GT(N) GC(N) TA(CT) AT(ACT) GA 3';

MEI4 - 5' AC(N) GC(N) AC(GA) TT(GA) AA(CT) TC(CT) TC(N) GC 3';

MEI5 - 5' GC(GA) TG(N) GC(N) A(GA)N AC(GA) TG(N) CC(N) CC 3'; and

isolating said PCR product.

22. A method according to claim 21 wherein the MEI1 and
MEI5 or MEI1 and MEI4 primers are used in combination.

23. A method according to claim 21 or claim 22 wherein
the MEI1 primer is 5' GGA GGG AAT GGA AAA GTG 3' and the
MEI4 is 5' GCA ACG TTG AAC TCC TCT GCA AT 3'.



24. A nucleic acid molecule comprising any one of the
following sequences for use as identification probes or
PCR primers

MEI1 - 5' GG(N) AA(GA) GT(N) GC(N) TA(CT) AT(ACT) GA 3';

MEI4 - 5' AC(N) GC(N) AC(GA) TT(GA) AA(CT) TC(CT) TC(N) GC 3';

MEI5 - 5' GC(GA) TG(N) GC(N) A(GA)N AC(GA) TG(N) CC(N) CC 3';

MEI1U - 5' GGA GGG AAT GGA AAA GTG 3'; and
MEI4U - 5' GCA ACG TTG AAC TCC TCT GCA AT 3'.

25. A method of transposon tagging comprising the steps
of creating a construct comprising a promoter according
to any one of claims 1 to 5 operably linked to a
transposase required for transposition of a transposable
element; transforming a plant cell with said construct
such that the transposase is driven by said promoter; and
determining transposition events.

26. A nucleic acid construct for selectively expressing
a nucleic acid sequence of interest comprising

a promoter according to any one of claims 1 to 5
upstream of said nucleic acid sequence of interest;

a nucleic acid insertion sequence flanked by target
sequences and positioned in between said promoter and
said nucleic acid sequence of interest; and

an activating gene under the control of an inducible
promoter such that on induction said activating gene
expresses an activator which activates said target
sequences thereby eliminating said insertion sequence
allowing the nucleic acid of interest to be expressed
under the control of the promoter.

27. A nucleic acid construct according to claim 26
wherein the activating gene is a DNA recombinase gene.

28. A nucleic acid construct according to claim 27
wherein the activating gene is cre recombinase and the
target sequences are lox recombination sites.

29. A method for selectively expressing a nucleic acid
sequence of interest in a plant comprising the steps of

transforming said plant cell with a nucleic acid
construct according to any one of claims 26 to 28;

regenerating a plant from said plant cell; and
inducing said inducible promoter.

30. A method of producing seeds apomictically comprising
the steps of modifying plant cells by incorporating into
their genome a nucleic acid construct according to any
one of claims 26 to 28 so that said modification nucleic
acid sequence is expressed in said meiotic cells thereby
eliminating sexually produced seeds in a plant
regenerated therefrom.

31. A method according to claim 30 wherein the construct
is incorporated into the genome by breeding.

32. A method according to claim 30 wherein the construct
is incorporated into the genome by transformation.

33. A method of detecting apomictic mutants comprising
the steps of generating a plant or plant part carrying a
nucleic acid construct according to any one of claims 26
to 29; creating a variation for apomixis in accordance
with a breeding programme wherein there is a natural
variation for apomixis; deriving a suitable self progeny
population; inducing said activating gene in the
self-progeny and detecting viable seeds produced
apomictically.

34. A method of detecting apomictic mutants comprising
the steps of generating a plant or plant part carrying a
nucleic acid construct according to any one of claims 26
to 29; mutagenising said plant or plant part in
accordance with a mutagenesis programme; deriving a
suitable self progeny population; inducing said
activating gene in the self-progeny and detecting viable
seeds produced apomictically.

35. A nucleic acid construct for selectively removing
one or more transgenes from the genome of a transgenic
plant comprising

a promoter according to any one of claims 1 to 5
operably linked to an activator gene for expressing an
activator; and

at least one target sequence flanking said transgene
and said target sequences being activated by said
activator to eliminate said transgenes.

36. A nucleic acid construct according to claim 35
wherein the activating gene is a DNA recombinase gene.

37. A nucleic acid construct according to claim 35 or
claim 36 wherein the activating gene is cre recombinase
and the target sequences are lox recombination sites.



38. A method of selectively removing one or more
transgenes from the genome of a transgenic plant
comprising the steps of

transforming a plant cell with a construct according
to any one of claims 35 to 37;

regenerating said transgenic plant from said plant
cell; and

recovering progeny from meiotic reproduction of said
transgenic plant.
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FIGURE3. 

-440 

TAAAATTAATTTCATTAGTGGATCCGCAAACAAATATTAGATTQG 

GCCTATATCCATC TATATTATTT 
-350 

AAATGCGCXITATCGTCCTATATGCATCCX^AATAATTAGTATACTC 

GGCTTATQGGCCTATATGCATTTGATTCTATOGATAAAATC 
-260 

TCAAATGTCTAATGTGCGCCGTTATGAA^ 

TTTTCACCTAGATTCCTTCTATTGACCGTCGATAGACGGATGATA 

-170 

ACTATGAQGTGGCATTATCGCAGCCATCAAACAAAGTCATGTATA 

ACAAACAAGAGCACACAAAOGAAAACAAATTCAGTTGCQ 
-80 

AAATTCAAATCAACGGAATTAGAATCACGCTTTCAATTCOGTAAC 

, _ + 1 

CCGCC^TTAAAAAfcCTTGAACCCTCGAAGCAAATCGAGCAAAGAT 

+ 11 

TTTCAAATI^CGAATTTCAAAATTCTATCTCTCTCACTCITCC^ 

GCTTAGAGAGTCTTAGAGCGAGAAAATGATGGCTTCTCTTAAQm 

+10 i M M A S L K 

AGTGATTGATCTCTCTCTTTCTCTC 

TTCTCCATTCATCGTTTTGGTTTAA^^ 
+ 191 

TAC CTGACTC GCTTCTTCTCG TTTTTATTTTGT^ CG ATGATC 

CTGATCTGCTTGTGTTCTTTC^ 
+281 A E E T S 

GCCAGATGCAGCTCG'ITGAGCGTGAAGAAAATGATGAAGACGAAG 
+326 QMQLVEREENDEDED 

ATCTATTIGAGATGATTGACAAATCTAAGATTTG^ 
L F E M I D K L 
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GTTAACACCGTTTATATGAGACAAAATCAGCTATGAGAT^^ 

1612 + + + + + + - 1671 

CAATTGTGGCAAATATACTCPGTTTTAGTCGA 

ATTAATTAAAAATAGTATAAATTAAATAArATAGTTCGATACACGAATA 

1672 + + + 4- +- 1731 

TAATTAATTTTTATCATATTTAATTTATTATATCAAGCT 

AATAGGCATACAAATTTGTCATACATGTTTCGATAITC 

1732 + + + + + +- 1791 

TTATCCGTATGTTTAAACAGTATGTACAAAGCTA^ 

AGTTTGATCTATAOGTATGCAAATTGAGAAGTACITC^ 

1792 + + + + + 1851 

TCAAACTACATATGCATAOGTTTAACTCTTCATGAAC^ 

CX1AACTTATCTTTTTGTTTTGGA 

1852 ♦ + + + + +- 1911 

GCTTGAATAGAAAAACAAAACCTAGTAGATAGCTTATGTTACCATGATATTA 

TTTTITTCTTCTTTTTCTTTAGTA^ 

1912 + + + + + +- 1971 

AAAAAAAGAAGAAAAAGAAATCATAGTTTTCGTTGCAATCTA 

TGATTGTGATGACTGATAGTCTGATAATATCATT 

1972 + + + + + 2031 

ACTAACACTACTGACTATCAGACTATTATAGT^ 

GTGTTCATATTTATAAATTCCAACCAACGT^ 

2032 + + + + + +- 2091 

CACAAGTATAAATATTTAAGGTTGGTTGCAATTAT 

ACAATATTATAATAAAATTAAAAAAACTACXSACT^ 

2092 + + + + + 2151 

TGrTATAATATTATTTTAATTTTIT^ 

TATGTTTTAAAAATAAAGTTCTTTAGTTCTAATA^ 

2152 + + + + + +- 2211 

ATACAAAATTTTTATTTCAAGAAATCAAGATTAT^ 

TATGTAAAAAGGTTITAGTACAAlTCTrTTTTC 

2212 + + + + + +1 2271 

ATACATTTTTrcCAAAATCATGTTAAGA 

TTTACTATTGATTTTTTTTAAAAATAAAATAACAA^ 

2272 4- + + + + +- 2331 

AAATGATAACTAAAAAAAATTl*ITATTTTATrGTT^^ 

TGATCGCAACTrAATTATAATT CT T LHUTr^ 

2332 + + +- + + 2391 

ACTAGCX5TTGAATTAATATTAAGAAGAAAAAAAAGAACCTTCrAATT^ 

CAATGTGGAACAAATAAATGTAGAAATATTGTTATC 

2392 + + + +- + +- 2451 

GTTACACX7TIK7ITTATTTACATCTTC 

AATATTTTCATATATACTTTTGA 

2452 + + + + + +~ 2511 

TTATAAAAGTATATATGAAAACTCGAAGACTACTATATTG?TCAA 

TTGTCXrTGTACTAATTTTTCTTTTGTTC 

2512 * + + - + 2571 

AACAGCACATGATTAAAAAGAAAACAAGTTCATACACTATTTTTATACAAOT 
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FI6URE4 (A) continued 

GAGTTATTATAATGGTACAAATATGTAGAGAGAATACATGAGAAGAGTTAAAAGAAGCA^ 

2572 + + + + + +- 2631 

CTCAATAATATTACCATGTTTATACATCTCTCTTATGTACT^ 

GCirAAGCCAACAGAGAGTGGATCCAAATGTTTG^ 

2632 + + + + + +- 2691 

CGAATTCGGTTGTCTCTCACCTAGGTTTACAACGAA 

CCACATTACTGCCACTGCTACATA^ 

2692 + + + + + +- 2751 

GGTGTAATGACGGTGACGATGTATATAACT^^ 

ATCGATGTCGATGATAAATTGATGATGATGGC^ 

2752 + + -»- + + +- 2811 

TAGCTACAGCTACTATTTAACTACTACTACCGA^ 

CTAGCTAGCTAGGACCATGTATATACATACATACATATATTA^ 

2812 + + + • + + 2871 

GATOSATCGATCCTGCTACATATAIOTATGTATG^ 

ACCTACC1TACAAACAGTATGGAGTTTACTAAAACO 

2872 -f + + + + +- 2931 

TGCATGGAATGTTTGTCATACCTCAAATG^ 

TTCGCAAGTGGGGATGAGTCTATGTAATAGAAGATGCA^ 

2932 + + + -i + + - 2991 

AAGCGTTCACCCCTACTCAGATACATTATCTTCT 

GTTTTTATTTAAAGAAATTAGTGTTTACIX^G 

2992 + + + * + +- 3051 

CAAAAATAAATTTCTTTAATCACAAATCACTC^ 

CCCATAAAAGCAAACCACTTCTCCTT^ 

3052 + + + + + +- 3111 

GGGTATTTTCGTTTGGTGAAGAGGAAGAAAAAT^ 

TTTTGAAAATITGAAGTGTACATTT^ 

3112 + + + + + +- 3171 

AAAACTTTTAAACTTCACATGTAAATCTC^ 

ATCTAAATATATAAACTCCAATTTAAAAT^ 

3172 + + + + + +- 3231 

TAGATTTATATATTTGAGGTTAAATTTTATTA^ 

GCCATTTOTTGGTCTTCATTTC^ 

3232 + + + + + +- 3291 

CGGTAAACAACCAGAAGTAAAGAGTACGAAACTAATCT 

AAATCA1X3TGCATAAACTAAGAAATAGCTAGCACA 

3292 + + + . ♦ + 3351 

TTTAGTACACGTATTK^TTCTTTATCGAT^ 

TACTATGTTCACTTTAAGAGAAAAAAAAACTT^ 
3352 + + + + + + - 34n 

ATGATACAAGTGAAATTCTCTTTTTTTTTGAW 

TATGATAATCAAAGTGCATATGTGAAGTGAGAGGCAACTGTAGACT 

3412 + + + + + +— 3471 

ATACTATTAGTTTCACGTATACACTTCACTCT^ 

CAAAGAAAATTTTTAAATATGAGAAAAAATTATATAAAAAGGTO 

3472 + + + + + +- 3531 

GTTTCTTTTAAAAATTTATACKrTTTT^ 

CTTTTGATATAGGGAGATTCGTTGAGCAT^ 

3532 + + — + + + 3591 

GAAAACTATATCCCTCTAAGCAACTCGTAGGTACACGAGAAAGTTAGCTC 
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FIGURE 4 (A) continued 

CTCHT^TCTAGCCAACCCACATATACCTT^ 

3592 + + + ♦ + +- 3651 

C^GATAGATCGGTTGGGTGTATATGGAA^ 

AAATCAATGTCATATAATATAATTAA<^ 

3652 + + + + +- 3711 

TTTAGTTACyvCTATATTATATTAATTCXn , ATATACGTATTTTTTA 

AGTCATX3TTACTTAAGGTCATGGTGTGTAAAAACATTGAT 

3712 + + + + + 3771 

TCAGTACAATGAATTCCACTACCACACATTT^^ 

TGAAGTGCTCTTAAAGTTATAACAT^ 

3772 - + + + + -+ +- 3831 

ACTTCACGAGAATTTCAATATTGTAGGCCA^ 

GTTTTTTACTCCAAATCAAATCAAGTQQ CT 

3832 + + + + + +- 3891 

CAAAAAATCAGGTTTAGTTTAGTTCAGCCAAGAAA 

TTTGAAAATATTAGCTATGATCTTAGCTTGGGTTTTTff 

3892 + + + + ♦ +- 3951 

AAACTTTTATAATCGATACTAGAATCGAACCC AAAAACAAAC^ CAATTCCTAGT AT 

TCTCTTTGTCAAATGACATGTGGTCTATATC 

3952 + + + + +- 4011 

AGAGAAACAGTTTACTCTACACCAGATATACA 

ATTGATTCXSACGACATTGGGAC^ 

4012 + + + + +- 4071 

TAACTAAGCTGCTGTAACCCTGAGGAGTGATGT^ 

GTTAATGX3CTTGTCACCATAAACTTTO 

4072 + + + + + +- 4131 

CAATTACCGAACAGTGGTATTTGAAAGT^ 

AGGTCTCACAATATATACAATTTCGAG^^ 

4132 + + + * + +- 4191 

TCXZAGAGTGTTATATATGTTAAAGCTCCCTA 

GGTAGAAATGTATAG TTTCrA GTAATAATAGAflATO 

4192 + + + h + +- 4251 

CCATCTTTACATATCAAAGATCATTATT^ 

AAAATTAATTTGATTAGTGGATCCXXAAACAA^ 

4252 ♦ + + + + +- 4311 

TTTTAATTAAACTAATCACCTAGGOCT 

ATTATTTTTATTTTTCTGTAATTTCAGTAA 

4312 + + + +• + +- 4371 

TAATAAAAATAAAAAGACATTAAAGTCATTTTACCC^ 

TAATTAGTATACTGGGOTTATCX3QCXn7VTATGCAT^ 

4372 + + + + + +- 4431 

ATTAATCATATGACCCGAATACCCGGATATACCT 

CAAATCTCTAATGT<XrGCCGTTATGAAGTGCA 

4432 + + + + + +- 4491 

GTTTACAGATTACACGCGGCAATACTTCACGTTCAC 

TTCTATTGACCGTCX^TAGACGGATGATAACTAT^ 
4492 + + + + + + - 4551 

AAGATAACTGGCAGCTATCT<XXrrACTATTGATACTGCACCCT 
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FIGURE4 (A) continued 

CAAAGTCATGTATAACAAACAAGAGCACACAAACGAAAAC^ 

4552 + + + + + +- 4611 

GTTTCAGTACATATTGTTTGTTCTCGTGT^ 

AATTCAAATCAACGGAATTAGAATCACXXTTTTCAAT^ 

4612 + + + + + +- 4671 

TTAAGTTTAGTTGCCTTAATCTTAG1«X3AAAGTTAA 

TCAACCXTCGAAGCAAATCGAGCAAAGA 

4672 + + +- 7 + + +- 4731 

ACTTGGGAGCTTCGTTTAGCTOGTTTCTAAA^ 

TCTCACTCTTCCAAGCTTAGA^ 

4732 + + + + 4779 

AGAGT>GAGAAGGTTCGAATCTCTCAGAA^ 



FIGURE 4 (B)



4473 TlTIU'riCACCTAGATTCCTTCTATTGACCGTCX^TAGACGGATGATAAC 4522 

llllllllllllllllllllllllllilllllllllllllllltllllll 
1120 tttttttcacctagattccttctattgaccgtcgatagacggatgataac 1169 

4523 TATGACGTGGCATTATCGCAGCCATCAAACAA 4572 

IIIMIIIIIIIIIIllllllllllllllllllllllllllliMIII I 

1170 tatgacgtggcattatcgcagccatcaaacaaagtcatgtataacaaaga 1219

4573 AGAGCACACAAACGAAAACAAftTTCAGTTG^ 4622 

llllllflililllllllllllllllllllllllinillllllllllll 

1220 agagcacacaaacgaaaacaaattcagttgcggaacccaaattcaaatca 1269 

4623 AC.GGAATTAGAATCACGCTTTCAAT^ 4671 

II IIIIIIIIIIIIIIIIIIIIIMIIIHItlllllllllllllllll 
1270 acgggaattagaatcacgctttcaattccgtaacccgccattaaaaacct 1319 



4672 TGAACCCTCGAAGCAAATC 4690 

lllllllllllllllllll 
1320 tgaaccctcgaagcaaatc 1338 
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FIGURE 5(A) 

GAACATAAAATTTAAAAAATGACTCTTGGGGATAGCAAAAATA 
! + + + + + + 60 

CTTGTATTTTAAATTTTTTACTGAGAACCCCT 

TCnTCGTTTTCTCATGCGATATGACAGTAAAAAGGATA 

61 + + + + + + 120 

ACAAGCAAAAGAGTACXXTTATACTGTCAT^ 

GTTGTAGGTGTGTGAACATCATTATATATAACATTTATGAAAGC^ 

121 + + + + + + 180 

CAACATCCACACACTTCTACTAATATATAT^ 

CAATGGCCGGACATTTATCCAAAGCAC^ 
181 + + + + + + 240 

GTTACCGGCCTGTAAATAGGTTTCGTGTGTTAGACGTGTC 
CAATGACLAAAATAAAAAGAAGGGAGGACATAAA 

241 + + + + + + 300 

GTTACTCTTTTATTTTTCTTXXXTT^ 

TCCGCAAAAAAAAAGATAGCACACIX^TGCAAAAACGAGA^ 

301 + + + + + + 360 

AGGCGTTTTTTTTTCTATCGTGTGAGTACG^ 

AAAAACAATAATGCCTCTGAAGGAGAGCA^ 

361 + + + + + + 420 

TTTTTGTTATTACGGAGACTTCCTCTOGTGTT^ 

AAACTTAGATGrR^TTAGATAATACTCATCrrAAAA 

421 + ^ + + + + 480 

TTTCAATCTACACGTAATCTATTATCAGT^ 

TTAATATAC CITCTTT CGATCAAATAAATAAAATATAC^ 

481 + + + + + + 540 

AATTATATGGAAGAAAGCTAGTTTATTTAT^ 

CTAAAGGACTTGAAAATGACAO^GAAAAATAAAAT 

541 + + + + + + 600 

GATTTCXTIXSAACTTTTACTGT ^ 

CAATAATAGCATCXSCTTAGCCuI^ 

GttTATTATCOTAGCGAATCGGCA^ 

CXXTTCCAACAGATAACATCAAGACAA^ 

661 + + + + + + 720 

GCGAGOTTGrcTATTGTAGTTCT 

GCAGTATAGTGGAACTACCACAAATCTAAAAGTTGTATT^ 

721 •*- + + + + + 780 

CX5TCATATCACCTTGATGGTGTTTAGAT^^ 

AGACTTTC7ITAAATCTCAGTCAACTGAGATTC 

781 + + + + + + 840 

TCTGAAACAATTTAGAGTCAGTTGACTCTAAGTrACGTAGGTA 

AACACTTACATAAATAGTTTTTATTGAAGCAATCATATATT^ 

841 + + + + + + 900 

TTGTGAATGTATTTATCAAAAATAACTTCGTTA 

TAGAAATTACACATATGATCACCGT 

901 - + — + + + * + 960 

ATCTTT AATGTGTAT ACT AGTGGCATGCTCTGT^ L m i\S J ' 
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FIGURE 5(A) continued 



AAAAAATTCACAAGAAGTAAAACTACTATTCCATCCT^ 

961 ♦ + + + + * 1020 

TTTTTTAAGTGTTCTIX^TTTTGATGA 

TGAGCAGTCCACAAGCGTTTTTCACXXTrCACCATC 

1021 + + + + + 1080 

ACTCGTt^GGTGTTCGCAAAAAGTGGCAGTGGTAC^ 

TCATAGCAAG&TCTCGGftAAGCACCACXA^ 

1081 + + + ♦ + + 1140 

AGTATCGTTCTAGAGCCTTrcGTGGTQ^ 

GGTCATGGTAAGCTACGAGACCAGGTCTTTGTGGC^ 

1141 + + + + + + 1200 

CCAGTACC^TTCGATGCTCIXSGTCCAGAAACACCOT 

GTCACCAGATCTCXSATQGAG^ 

1201 ♦ + + + + + 1260 

CAGTGGTCTAGAGCTACCTCAATTC 

GCCCCXnTCAAAATTTTGTATTTO 

1261 + + + + + + 1320 

CGGGGCAAGTTTTAAAACATAAAGCTTACCCGGAT 

CTTGGTGTTA 

1321 + + 4. + + + 1380 

CGCAGCCTCAGTGATCCACCCGCACCTC^^ 

13 81 + + ♦ + + + 1440 

3TTGTTTGGAGGGGAGAGGGGGTOGGAGGC 



1441 + + + + + + 1500 

TCCACCCTXTTCCTUCTCOK^ 

1501 + + + + 1- + 1560 

AGCH\3GCXX3AAGOT3AGGTXX^ 

CITCTCCTATGWSGCTG^^ 
1561 — + + + + + + 1620 

ACGAGQCXXXSGCAGCTCCAGCTrCA 

1621 + + + + + + 1680 

TGCTCCCGCCCGTCX3AGC7^ 

TCGAGTCCATTGACAAGTCTACGTTC^ 

1681 + + + + + + 1740 

AGCKZAGGTAACTGTTCACATGCAAGCTGCTC 

CGCCGCCCCCGTCTCCGGCGTGCGTCGCGCXXrGT^^ 

1741 + + ♦ + + + 1800 

GCG(XXXX3GGCAGAGGCCGCACGCAGCGCGCGCA 

CX^GTTGCXTITCOGGGTGCCTCCGCCTC^^ 

1801 - + + + + + + i860 

GCGGCAACGCAAGGCCCACGGAGGCGGAGAGATTCAAGCGCGCAA^ 

TXXriXTTTTGGAGGTAGACTTGGTGa^ 
1861 + «- + * + + 1920 

ACGACAAACXTTCCATCTGAACCACGCCTAAA^ 
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FIGURES (A) continued 

GGTGCGTGCAQCGTAGTCGCGTGCTCG^ 

1921 + + + + + + 1980 

CXIACGCACGTCGCATCAGCGCACGAGCACAACGCAACC^ 

GTGCAATCTGGGGAGCAACTAGCGCXXrrTTGGTAGCCT 

1981 + + 4- + + + 2040 

CACGTTAGACCCCTCGTITGATCGCGOGAAACCATO 

AGTClX^CTTAGOCCCATTrTCTGCCAACA 

2041 + + + + + + 2100 

TKIAG ACTTGAATCGGGGTAAAACACGGTTGTACX^^ 

CTGTAGAATCCITTCGAATTCCAAACrAGATATAGTT^ 

2101 + + + + + + 2160 

GACATCTTAGGAAAGCITAAGGTT^^ 

AACTTTGACCAATTACGAACGCGTTTTCGTGC^^ 

2161 — + + + + + + 2220 

TTGAAACTGGTTAATGCTTGCGCAAAAGCACG^ 

CTTACTCATCTGTTQCACATTTTTTTCTGAATGTTC 

2221 +- + +• + *- + 2280 

GAATGAGTAGACAACGTGTAAAAAAAGACTTACAAGTAGACAAGACAAGGTAT^ 

GTAGCTTCTCnX5AGCTTAGCAAATTTGA 

2281 + + + + + + 2340 

CATCGJyVGACACTOGAATCGTTTAAACTA^ 

CTGTGTTTTGTTTGCGTGATCCAGTTQG^^ 

2341 + + + + +- + 2400 

GACACAAAACAAACGCACTAGGTCAACCAAGAA^ 

ATTTGAATCCTAAACCAAGATOIX^^ 

2401 + + + + + + 2460 

TAAACTTAGGATTTGGTICTACACTAGGTGGG^ 

GTCCAAAOKSTGATCACGCAGGGG^ 
2461 + + + + + + 2520

GGGATCTACACTTGCAATCXXXr^ 

2521 + + + + + + 2580 

CCCTAGATGTGAAOglTACCGGACTACTACGTO 

TAAATCIX^TCXXTTTCIXXTTA 

2581 + + + + + + 2640

ATTTAGACCTACGGAAGa^GGAATTTTAAACACCAA 

ATGACGCATGACTGCTATTAGTTX3TTTTATOT 

2641 + + + + + + 2700

TACTGCGTACTGACGATAATCAACAAA^ 

GTTTCAATATCAAlXrrATGGATrcATCTTGCAGA^ 

2701 + + + + + + 2760

CAAAGTTATAGTTACATACXITrAACTAG 

GAAGCAAAGGTTGATAAGATCTGCGAGGCTGCT^ 

2761 + + + + + + 2820

CTTCGTTTCCAACTATTCTAGACGCTCCGAC^ 

TCTTTGCATTGATTCTGATTTAGAACTTCTC 

2821 + + + + + + 2880

AGAAACGTAACTAAGACTAAATCTTGAAGACACGGrTATAATAAGAGAAACOT 
FIGURE 5 (A) continued



GGTTTACAACTTCTGTGCCATCCGTXXTTATGTQATAGAG 

2881 + + + + + + 2940

CCAAATGTTGAAGACACGGTAGGCACGATACACTATCT^ 

TTTGATTTTCAGAGTCAGGGTTTCATGACAGGAAGTC 

2941 + + + + + + 3000 

AAACTAAAAGTCTCAGTCCCAAAGTACTGTCCTTCAOT 

AGGAGCTAAGOTACTGATGAAGGACGATCATACAGTTTAGTG^ 

3001 + + + + + + 3060 

TCCTCGATTCGATGACTACTTCCTGCTAGT^ 

TACTATGCTAAACTT AC y IXj l l AAAC AGTAGT i'l'G*ri , ri ,, l tJGAATCTGCTTGAGTGCTCTCA 

3061 + + + + + + 3120 

ATGAT ACGATTTGAATG ACATTTGTC ATC AAACAAAAAC CTTAG ACXJAACTC ACGAGAGT 

T^ACTTCA.CCTGTGCACTCCCCXX5T^ 

3121 + + + + + + 3180 

AATGAAGTQGACACGTGAaSCXXrCAA 

TCTGTTGTCCGGATTACCACTGGGAGCCAAGCACTT^ 

3181 + + + + + + 3240

AGACAACAGGCCTAATGGTGACCCTCGGTTOSTGAA^ 

TGTCGTCCTIGATTCICTTICTGA 

3241 + + + + + + 3300

AGAGCAGGAACTAAGACAAGACTAATAAGG^ 

TCCATAATTTGAAGGAGGGATTGAAACTCTCTGTATC 

3301 + + + + + + 3360 

AGKyTATTAAACTTCCTCCXTrAACTT 

3361 + + + + + + 3420

GAGTTCATTTATAAGGCCAAIIXXaAAAA 

CAATATATTAGTTGTATATCTGATATCTATG^ 

3421 + + + + + + 3480 

GTTATATAATCAACATATAGACTATACATACTAAACACGCTATAAA 

TTAGTGCTAGTAATTTTTTAGTGOGTGTTGCTAA 

3481 + + + + + + 3540

AATCACGAIX^TTAAAAAATCACtXACAACCA 

TTTTGTACraiCTAGTCTTCXA 

3541 + + + + + + 3600

AAAACATGACAGATCAGAAGGTtrroCATAA^ 

GTTATTCKTOTGATGOTAACGCTGAATACTG^ 

3601 + + + + + + 3660 

CAATAACGACACTACX^TTG<X3ACTTATGACCTATG 

3661 + + + + + + 3720

ATACATCTAGTTATAATAATAAGTG^^ 

TTTTTCCATGAGCACAATTGTAACyV^ 

3721 + + + + + + 3780

AAAAAGGTACTXXTIK7ITAAC7VTTGTC 

CTCAATTCAGTTCGAACTTTCTAGCGTTA 

3781 + + + + + + 3840

GACTTAAG^TCAAGGTTGAAAGATCXXrAATTTTAGGTATAAT 
FIGURE 5 (A) continued

CAGGGAAGACCCAGTTGGCTCATACTCTTT^^ 
3841 + + + + + + 3900 

GTATTTCATCGATGTAACCTCACTACAACAG^ 

3901 + + + + + + 3960 

CATAAAGTAGCTACATTGGAGTGATGTTGTCTTAGOT 

AAATGTATAACAGCTTCCACTCCACATGCAIIKX^ 

3961 + + + + + + 4020

TTTACATATTGTCGAAQGTGAGGTGTACGTACCACCC^^ 

CACTGAGGGAACATTGTATCCITTGGA 

4021 + + + + + + 4080

GTGACTCCCITOrAACATAGGAAACCTAAQGAAT^ 

TCAATTCATITACATTTCTATATTTCTCGAA 

4081 + + + + + + 4140

ACTTAAGTAAATGTAAAGATATAAAGAGCTTTGACAAAGGA^ 

CCTGAACGCATTGTGCC^riX^^ 

4141 + + + + + + 4200 

GGACTTGCOTAACACGGTTAACGACTCTCTAAAC^ 

AATGTATGGCTCCTATTACATCTCTCTTGACCCA 

4201 + + + + + + 4260 

TTACATACCGAGGATAATCTAGAGW 

TGTTTAATTCGTGATCTTTCTGTTTT 

4261 + + + + + + 4320 

ACAAATTAAGCACTAGAAAGACAAAATCTAGTATATG 

CCAGTACAACITACTCCTGGGCC^^ 

4321 + + + + + + 4380

GGTCATGTTGAATGAGGAOXGGAA^ 

GCTACGCATGACTTTGCTGCCATGTAAATT^ 

4381 + + + + + + 4440

CCATGOGTACTGAAACGAOGOTACATTTAAATGTTAACT 

TGAl irmVlTl^ CTTGGA^ 

4441 + + + + + + 4500

ACTAGAAACAAACTGAACCTTTACTATCT^ 

TGATTTCAGTOGTAGGGX?TGAACTTCCA^ 

4501 + + + + + + 4560

ACTAAAGTCACCAIXXCCACTOWVCGT^ 

TACGTXSAAAAAATCAAGCAACTIX^GGAAGTACT 

4561 + + + + + + 4620 

ATGCA Cl " ri4 * l * r AGT '1 0 yi ' l X3AGTCCTTCA^ 

4621 + + + + + + 4680

ATTCACTACACaAGACaCTGACGT ^^ 

GATOXTGAGGAGTTCMTGTTGCAGTCT 

4681 + + + + + + 4740 

CTAACGACTCCTCAAGTrACAACXnCACATCTAGTGGTT^ 

AATCCTATCTTTTCCAAGAAAGAGCT 

4741 + + 4766 

TTAGGATAGAAAAGGTTCTTTCTCGA 
FIGURE 5 (B)



AAATTGIKSAAQCACTTTCTATAAAAAT^ 
1 + + + + + + 60 

TTTAACACTTCGTGAAAG^^ 

GTATCATATAATATTTATAIXXX^TGATG^ 

61 + + + + + + 120 

CATAGTATATTATAAATATACGCTACTACTAATCCGCCTGGAGAGATGCACTC 

ATTTGAATATTTTACTTAC X * XUXj1VTO 

121 + + + + + + 180

TAAAC1TATAAAATGAATGGAACAAAACAAAAAGGAAAGATTTTAACT 

AAGTGGTTTGGATAGATAAAAAATAATTTTGGACCTA^ 

181 + + + + + + 240 

TTCACCAAACCTATCTATTTTTTATTAAA 

GTAATAAGATTCTAAATGTTAAAACCAATAACGGATATCTTAGATCCACCTTTTTCATAA

241 + + + + + + 300 

CATTATTCTAAGATTTACAATTTTGCTTATTGC^ 

TAGTCCACCCATCATCTCAAAATTTTCGA 

301 + + + + + + 360 

ATCACGTGGCTAOTAGAGTTTTAAAAC 

TTTATATGCTTTATGCTGAATCA1K3ATATCATT^ 
361 + + + + + + 420 

AAATATAOSAAATACGACTTAGTACTATAGT^ 

TTGTCTGATTTACKX»GCTTC^ 

421 + + + + + + 480

AACAGACTAAATGACGTCGAAGGCTX^TCATACT^ 

ATTQATACTGAGGGAACATIX?ITTCCTTGCTAA 

481 + + + + + + 540 

TAACTATGACTCCCTTGTAACAAAGGAAC^ 

TAGCACCTATTACTCTCTTCATTAACT 
541 + + + + + + 600 

ATCCTGGATAATGAGAGAAGTAATTC 

CTTCCATTTTATCTTTTTTCCTCAACCA^ 

601 + + + + + + 660

GAAGCTAAAATAGAAAAAAG<3AGTTrGGTTCGC 

OCATTGCTGAAAGATTTGG^TGG 

661 + + + + + + 720 

GGTAACGACTTTCTAAACCTTACCTGCXSACC^ 

TACACCCACATTTAATCATXTTACTGC^^ 

721 + + + + + + 780

ATGTGGGTGTAAATTAGTAGATGACGA^ 

TTCCTTATTATGTIKIMX^G^ 

781 + + + + + + 840 

AAGGAATAATACACTAGTCTAGTAAATACGAGCGCGTATGTGTATA 

CCTGCTTCTTGGTCTGGCAGCAAAAATG<XriX^^ 
841 + + + + + + 900

GGACGAAGAACCAGACCGTOSTT^ 

CACATCATCTGCTTTATCTTGAATAAGACCATTACTO^ 

901 + + + + + + 960

GTGTAGTAGACGAAATAGAACTTATTCTGGTAATG^ 
FIGURE 5 (B) continued

AATTTTACTTGCAGATTGTTGACTCT^^ 

961 + + + + + + 1020

TTAAAATGAACGTCTAACAACTGAGACACTAACGAAATA^ 

GAGGAGAGCTTGCAGACCGTCAGGTATAACTAAAT^ 

1021 + + + + + + 1080 

CTCCTCTCXSAACGTCTGGCA 

AAAAACCTATCTCTGATATTTATCTGTGTTG 

1081 + + + + + + 1140 

TTTTTGGATAGAGACTATAAATAGACACAACT^^ 

GACTITTTCTGAATGCTrACGCCTI^^ 

1141 + + + + + + 1200

CTGAAAAAGACTTACK1AATGCGGAAGAACGGTAAAG 

CACGATTGATAAAGATAGCTGAGGAATTTAAT^^ 

1201 + + + + + + 1260 

GTGCTAACTATTTCTATCXSACTCXriTAA^ 

TACAATCXIAACtZTITGCGTTTTACA 

1261 + + + + + + 1320

ATGTTAGGTTGGAACGCAAAATGTTTTACATAAACAAATC 

GTGTATAGATACACTGCTACTTCCTAAGTGT^^ 

1321 + + + + + + 1380

CACATATCTATGTGACGATGAAGGATTCACAGCrAC^ 

CCTAIX^AATACTCTTTGCATGGCACTTTC 

1381 + + + + + + 1440

GGATACATTATGAGAAACGTACCGTGAAACGTTTG^ 

TACTGTATATTGATTGCCTTTCCTCGGACATTGA 

1441 + + + + + + 1500

ATGACATATAACTAACGGAAAGGAGCCTGTAACT^ 

TTCTCTTATTATTGAATGCAATTCCATC^^ 

1501 + + + + + + 1560

AAGAGAATAATAACTTACGTTAAGCTAGA^ 

CAGATCXTAAAGAAACCAGCAGGAGGCC^ 

TGTTCAGGAAGGGCAAASGAGAACAGCGT^^ 

1621 + + + + + + 1680

ACAAGTCCTTCCCOTTTCCTCT^ 

CAGAGTCTGAAGCGATATCCTTTTTC^ 

1681 + + + + + + 1740

GTCTCAGACTTCXXTTATAGGAAA 

GATTGTGATGTTTAGAGAGAAGAAGAAAGCTCC^ 

1741 + + + + + + 1800

CTAACACTACAAATCTCTGTTCTTCTTTC 

TCTTCTTTCCACIK^GTCGAAATTAGT^ 

1801 + + + + + + 1860

AGAAGAAAGGTCACTCAGCTTTAATC^ 

TCCTTGACGAAAAACACATCTTCCAGATA^ 

1861 + + + + + + 1920

AGGAACTTGCTTTTTCTGTAGAAGGTCT^^ 
FIGURE 5 (B) continued

GAAAACCAGCITOGACTTGCAACAGAGAGCTGCTTAT^^ 

1921 + + + + + + 1980 

CTTTTGGTTCAACCTGAACGTTC 

GCTTACATTTTGAGGTTAGCTATAAATATATTAATrTTGCATA^ 

1981 + + + + + + 2040

CGAATGTAAAACTCCAATCGATATTTATATAATTAAAACGTAT^ 

AATCTTITCTTACACTTATCTTGGTTGAGTGOT 

2041 + + + + + + 2100

ITACAAAAGAATGTGAATAGAACCAACTCACGACACAATA^ 

TTXXrrCTTAGAGGATGACAATGTCAAGA 

2101 + + + + + + 2160

AACGAGAATCTCCTACTGTTACAGTTCTC^ 

CACGAAATGTTGACATCX^AATAGC^ 

2161 + + + + + + 2220

G ^ t X T lT ACAACTCTACGTTATC 

TAGCATGCAATTTTTAGTCCTGATAGT^ 

2221 + + + + + + 2280 

ATCGTACGTTAAAAATCAQGACTATCAACTAGGTITAGTAGATC^ 

TCTCCTAATACXTTGTCCTACCAGCATATATGTGTAA 

2281 + + + + + + 2340 

AGAGGATTATGGACAGQATGGTCGTATATACACATTGTACGGT^ 

TTCAAAACATGGGAAATTTGCAAGGGTAAAAGGAAAA 

2341 + + + + + + 2400 

AACTTTTOTACCCTTTAAACGTTCra 

AGATAAGCAAAGCXXXTCAATCGAAGACTGrrTATTTA 

2401 + + + + + + 2460 

TCTATTCGTTTCGGGGACTTAGCTTC 

GAAAATGGCTTAATTCAGTTTTAGAAA 

2461 + + + + + + 2520

CTTTTACCGAATTAAGTCAA^ 

CATATCTTCCCTC^CTTTTTAAOT 

2521 + + + + + + 2580

GTATAGAAGGGAGTCAAAAATTGTAAAQAGATC7ITTA 

AGTTATCAACTAGTIOATAACAGAAAGA£?rTG 

2581 + + + 2612

TCAATAGTTGATCAACTAT 
FIGURE 6